MAMMALIAN FLAP-SPECIFIC
ENDONUCLEASE

TECHNICAL FIELD 

The invention provides novel polypeptides which are
substantially identical to a naturally-occurring mammalian
flap-specific endonuclease, polynucleotides encoding such
polypeptides, polynucleotide derivatives of naturally-occurring
mammalian flap endonuclease genes and mRNA, antibodies which
are reactive with such polypeptides, polynucleotide
hybridization probes and PCR amplification probes for detecting
polynucleotides which encode such novel polypeptides,
transgenes which encode such polypeptides, homologous targeting
constructs that encode such polypeptides and/or homologously
integrate in or near endogenous genes encoding such
polypeptides, nonhuman transgenic animals which comprise
functionally disrupted endogenous genes that normally encode
such polypeptides, and transgenic nonhuman animals which
comprise transgenes encoding such polypeptides. The invention
also provides methods for detecting a pathological condition in
a patient, methods and compositions for diagnostic
polynucleotide hybridization and/or amplification, methods for
screening for antineoplastic agents and carcinogens, methods
for diagnostic staging of neoplasia, methods for producing
recombinant flap endonuclease for use as research or diagnostic
reagents, methods for producing antibodies reactive with the
novel polypeptides, and methods for producing transgenic
nonhuman animals expressing the novel polypeptides encoded by a
transgene. The invention also provides novel molecular cloning
techniques and reagents involving cleavage of a flap or nick
with a flap endonuclease.

BACKGROUND 

DNA can be damaged by a variety of environmental
insults, including antitumor drugs, radiation, carcinogens,
mutagens and other genotoxins. Chemical changes in the
component nucleotides or in DNA secondary and tertiary
structure which arise from such external causes are all
considered herein to be DNA modification or damage. In
addition, it is recognized that certain chemical and/or
structural modifications in DNA may occur naturally, and may
play a role in, for example, DNA replication, expression, or
the coordinate regulation of specific genes. It has been
proposed that some types of DNA modification or damage
arising from external sources are similar to, or even mimic,
certain types of natural DNA chemical and/or structural
modification.

DNA damage can lead to mutations and cancer, as well as
cell death; the latter is exploited in chemo- and
radiotherapeutics. A better understanding of DNA chemical and
structural modifications, including DNA damage, would
also be helpful in that it might serve as the basis for
developing an enhanced ability to repair or otherwise
modify the effects of such damage, leading in turn to
improved organismal or tissue resistance to DNA damaging
agents.

DNA Repair and Endonuclease Activity

Nucleotide excision repair (NER) is a major pathway by
which damaged nucleotides are removed from DNA. The
biochemical steps leading to the repair of damaged DNA
bases include recognition of damage, incision and removal
of the damaged strand, DNA synthesis to replace the excised
nucleotides, and ligation. Genetic studies in yeast have
identified seven repair genes that are absolutely required for
the initial DNA incisions to occur. These genes have been
shown to encode nucleases (RAD1/10 and RAD2)
(Habraken et al. (1993) Nature 366: 365; Tomkinson et al.
(1993) Nature 362: 860), helicases (RAD3 and RAD25)
(Sung et al. (1987) Proc. Natl. Acad. Sci. (U.S.A.) 84: 8951;
Harosh et al. (1989) J. Biol. Chem. 264: 20532; Park et al.
(1992) Proc. Natl. Acad. Sci. (U.S.A.) 89: 11416), and a
damage recognition protein (RAD14). Current models indicate
that specific branched DNA structures form at the site of DNA
damage in yeast. The resulting branched DNA structures
may then be cleaved by the single-stranded endonucleases,
RAD1/10 and RAD2.

U.S. Pat. No. 5,359,047 reports a mammalian cellular
factor that selectively recognizes and binds DNA damaged
or modified by the anticancer drug
cis-diamminedichloroplatinum (II) (cisplatin). This DNA
structure-specific recognition protein (SSRP) recognizes and
selectively binds to a structural motif present in damaged
DNA characteristic of DNA damaged by therapeutically
active platinum compounds.

U.S. Pat. No. 5,324,830 reports a chimeric enzyme comprised
of an endo-exonuclease, RhoNUC, from S. cerevisiae
which functions in both repair and recombination (Chow
and Resnick (1987) J. Biol. Chem. 262: 17659; and Chow
and Resnick (1988) Molec. Gen. Genet. 211: 41). Repair
processes in the yeast Saccharomyces cerevisiae are under
extensive genetic control involving over 50 genes; among
these are genes that function in recombinational repair as
well as normal meiotic and mitotic recombination (Kunz and
Haynes (1981) Annu. Rev. Genet. 15: 57; Game, J. C. (1983)
in: Yeast Genetics, Fundamental and Applied Aspects (eds.
Spencer, J. F. T., Spencer, D., and Smith, A.) pgs. 109-137,
Springer-Verlag New York, Inc., New York; and Resnick, M.
A. (1987) in: Meiosis (ed. Moens, P.), pgs. 157-212,
Academic Press, New York).

Nuclease activity associated with the Escherichia coli
recBCD proteins is required for much of host recombination
and also for chi-stimulated lambda bacteriophage recombination
(Chaudhury and Smith (1984) Proc. Natl. Acad. Sci. (U.S.A.)
81: 7850). Holloman and Holliday (1973) J. Biol. Chem.
248: 8107 have described nuclease alpha from the eucaryote
Ustilago maydis that is required for recombination and DNA
repair. An endo-exonuclease from Neurospora crassa has
also been implicated in recombination and repair (Chow and
Fraser (1979) Can. J. Biochem. 57: 889; Chow and Fraser
(1983) J. Biol. Chem. 258: 12010; and Ramotar et al. (1987)
J. Biol. Chem. 262: 425). The phenotypes of mutants deficient
or altered in this specific nuclease activity include
meiotic sterility and sensitivity to ultraviolet light, X-rays,
and/or alkylating agents (Fraser, M. J., et al. (1990) in: DNA
Repair and Mutagenesis in Eucaryotes (Generoso et al., eds)
pgs. 63-74, Plenum Publishing Corp., New York). A similar
endo-exonuclease has also been isolated from Aspergillus
nidulans (Koa et al. (1990) Biochem. Cell. Biol. 68:
387-392) and from mammalian mitochondria (Tomkinson
et al. (1986) Nucl. Acids Res. 14: 9579).

Recombination and Endonuclease Activity 

Homologous recombination (or general recombination) is
defined as the exchange of homologous segments anywhere
along a length of two DNA molecules. An essential feature
of general recombination is that the enzymes responsible for
the recombination event can presumably use any pair of
homologous sequences as substrates, although some types of
sequence may be favored over others. Both genetic and
cytological studies have indicated that such a crossing-over
process occurs between pairs of homologous chromosomes
during meiosis in higher organisms.

Alternatively, in site-specific recombination, exchange
occurs at a specific site, as in the integration of phage λ into
the E. coli chromosome and the excision of λ DNA from it.
Site-specific recombination involves specific sequences of
the phage DNA and bacterial DNA. Within these sequences
there is only a short stretch of homology necessary for the
recombination event, but not sufficient for it. The enzymes
involved in this event generally cannot recombine other
pairs of homologous (or nonhomologous) sequences, but act
specifically on the particular phage and bacterial sequences.

Although both site-specific recombination and homologous
recombination are useful mechanisms for genetic engineering
of DNA sequences, targeted homologous recombination
provides a basis for targeting and altering essentially
any desired sequence in a duplex DNA molecule, such as
targeting a DNA sequence in a chromosome for replacement
by another sequence. Site-specific recombination has been
proposed as one method to integrate transfected DNA at
chromosomal locations having specific recognition sites
(O'Gorman et al. (1991) Science 251: 1351; Onouchi et al.
(1991) Nucleic Acids Res. 19: 6373). Unfortunately, since
this approach requires the presence of specific target
sequences and recombinases, its utility for targeting
recombination events at any particular chromosomal location is
severely limited in comparison to targeted general
recombination.

For these reasons and others, targeted homologous
recombination has been proposed for treating human genetic
diseases. Human genetic diseases include: (1) classical
human genetic diseases wherein a disease allele having a
mutant genetic lesion is inherited from a parent (e.g.,
adenosine deaminase deficiency, sickle cell anemia,
thalassemias), (2) complex genetic diseases like cancer,
where the pathological state generally results from one or
more specific inherited or acquired mutations, and (3)
acquired genetic disease, such as an integrated provirus
(e.g., hepatitis B virus). However, current methods of
targeted homologous recombination are inefficient and produce
desired homologous recombinants only rarely, necessitating
complex cell selection schemes to identify and isolate
correctly targeted recombinants.

A primary step in homologous recombination is DNA
strand exchange, which involves a pairing of a DNA duplex
with at least one DNA strand containing a complementary
sequence to form an intermediate recombination structure
containing heteroduplex DNA (see, Radding, C. M. (1982)
Ann. Rev. Genet. 16: 405; U.S. Pat. No. 4,888,274). The
heteroduplex DNA may take several forms, including a
triplex form wherein a single complementary strand invades
the DNA duplex (Hsieh et al. (1990) Genes and Development
4: 1951) and, when two complementary DNA strands
pair with a DNA duplex, a classical Holliday recombination
joint or chi structure (Holliday, R. (1964) Genet. Res. 5: 282)
may form, or a double-D loop. Once formed, a heteroduplex
structure may be resolved by strand breakage and exchange,
so that all or a portion of an invading DNA strand is spliced
into a recipient DNA duplex, adding or replacing a segment
of the recipient DNA duplex. Alternatively, a heteroduplex
structure may result in gene conversion, wherein a sequence
of an invading strand is transferred to a recipient DNA
duplex by repair of mismatched bases using the invading
strand as a template (Genes, 3rd Ed. (1987) Lewin, B., John
Wiley, New York, N.Y.; Lopez et al. (1987) Nucleic Acids
Res. 15: 5643). Whether by the mechanism of breakage and
rejoining or by the mechanism(s) of gene conversion,
formation of heteroduplex DNA at homologously paired joints
can serve to transfer genetic sequence information from one
DNA molecule to another.

The ability of homologous recombination (gene conversion
and classical strand breakage/rejoining) to transfer
genetic sequence information between DNA molecules
makes targeted homologous recombination a powerful
method in genetic engineering and gene manipulation.

The ability of mammalian and human cells to incorporate
exogenous genetic material into genes residing on chromosomes
has demonstrated that these cells have the general
enzymatic machinery for carrying out homologous recombination
required between resident and introduced
sequences. These targeted recombination events can be used
to correct mutations at known sites, replace genes or gene
segments with defective ones, or introduce foreign genes
into cells. The efficiency of such gene targeting techniques
is related to several parameters: the efficiency of DNA
delivery into cells, the type of DNA packaging (if any) and
the size and conformation of the incoming DNA, the length
and position of regions homologous to the target site (all
these parameters also likely affect the ability of the incoming
homologous DNA sequences to survive intracellular
nuclease attack), the efficiency of recombination at particular
chromosomal sites, and efficient and correct resolution
(repair) of overlapped recombination joints and intermediate
recombination structures.

Unfortunately, exogenous sequences transferred into
eukaryotic cells undergo homologous recombination with
homologous endogenous sequences only at very low
frequencies, and are so inefficiently recombined that large
numbers of cells must be transfected, selected, and screened
in order to generate a desired correctly targeted homologous
recombinant (Kucherlapati et al. (1984) Proc. Natl. Acad.
Sci. (U.S.A.) 81: 3153; Smithies, O. (1985) Nature 317: 230;
Song et al. (1987) Proc. Natl. Acad. Sci. (U.S.A.) 84: 6820;
Doetschman et al. (1987) Nature 330: 576; Kim and Smithies
(1988) Nucleic Acids Res. 16: 8887; Doetschman et al.
(1988) op.cit.; Koller and Smithies (1989) op.cit.; Shesely et
al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88: 4294; Kim et
al. (1991) Gene 103: 227).

Koller et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88:
10730 and Snouwaert et al. (1992) Science 257: 1083, have
described targeting of the mouse cystic fibrosis transmembrane
regulator (CFTR) gene for the purpose of inactivating,
rather than correcting, a murine CFTR allele. Koller et al.
employed a large (7.8 kb) homology region in the targeting
construct, but nonetheless reported a low frequency for
correct targeting (only 1 of 2500 G418-resistant cells was
correctly targeted). Thus, even targeting constructs having
long homology regions are inefficiently targeted.
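
To put the reported frequency in perspective, the screening burden can be estimated with elementary probability. The following is a minimal sketch, assuming each G418-resistant colony is an independent trial that is correctly targeted with probability 1/2500 (the frequency reported above); the independence assumption and the function names are illustrative and are not taken from the cited work.

```python
import math


def prob_at_least_one(p: float, n: int) -> float:
    """Probability that at least one of n screened colonies is correctly
    targeted, assuming independent colonies with success probability p."""
    return 1.0 - (1.0 - p) ** n


def colonies_needed(p: float, confidence: float) -> int:
    """Smallest number of colonies n such that the chance of finding at
    least one correctly targeted colony is >= confidence."""
    if not (0.0 < p < 1.0) or not (0.0 < confidence < 1.0):
        raise ValueError("p and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


# Frequency reported for the 7.8 kb homology construct: 1 in 2500.
p_correct = 1 / 2500
n_95 = colonies_needed(p_correct, 0.95)
print(f"Colonies to screen for a 95% chance of one hit: {n_95}")
```

Under these assumptions, roughly 7,500 resistant colonies would have to be screened for a 95% chance of recovering a single correctly targeted clone, which illustrates why such low frequencies necessitate complex selection schemes.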

Several proteins or purified extracts having the property
of promoting homologous recombination (i.e., recombinase
activity) have been identified in prokaryotes and eukaryotes
(Cox and Lehman (1987) Ann. Rev. Biochem. 56: 229;
Radding, C. M. (1982) op.cit.; Madiraju et al. (1988) Proc.
Natl. Acad. Sci. (U.S.A.) 85: 6592; McCarthy et al. (1988)
Proc. Natl. Acad. Sci. (U.S.A.) 85: 5854; Lopez et al. (1987)
op.cit., which are incorporated herein by reference). These
general recombinases presumably promote one or more
steps in the formation of homologously-paired
intermediates, strand exchange, gene conversion, and/or
other steps in the process of homologous recombination.
WO 93/22443 discloses the use of recA to promote homologous
recombination of gene correction and gene targeting
vectors in vivo, including its use in gene therapy.

Thus, there exists a need in the art for compositions and
methods of modulating DNA repair; compositions and
methods for pharmaceutical development assays to identify
agents that modulate DNA repair and cell proliferation;
novel DNA repair and/or DNA replication and/or DNA
recombination enzymes, compositions thereof, and encoding
polynucleotides; and methods and compositions for
using such enzymes to efficiently alter predetermined
endogenous genetic sequences by nonhomologous and/or
homologous recombination in vivo by introducing one or more
exogenous targeting polynucleotide(s) that efficiently and
specifically homologously pair with a predetermined
endogenous DNA sequence or nonhomologously integrate. There
exists a need in the art for high-efficiency gene targeting and
gene therapy, so that complex in vitro selection protocols
(e.g., neo gene selection with G418), which are of limited
utility for in vivo gene therapy on affected individuals, are
avoided. There also exists a need in the art for transgenic
animals, such as knockout mice, which lack one or more
DNA repair/DNA replication/DNA recombination enzymes;
such mice can be sold as laboratory reagents and for
pharmaceutical and toxicological testing. There also exists a
need in the art for diagnostic methods employing novel
DNA repair/DNA replication/DNA recombination enzymes
and/or polynucleotides which encode such enzymes or
substantially identical sequence variants thereof.

The references discussed herein are provided solely for
their disclosure prior to the filing date of the present
application. Nothing herein is to be construed as an admission
that the inventors are not entitled to antedate such disclosure
by virtue of prior invention.

SUMMARY 

The present invention provides several novel methods and
compositions relating to FLAP endonucleases, including but
not limited to the human and murine FEN-1 FLAP
endonuclease polypeptide and gene sequences, and novel
deletion mutants of a Saccharomyces RAD2 gene and protein.
These methods and compositions have a variety of
applications, such as for modulating DNA repair/replication/
recombination activities, for performing a variety of
molecular cloning techniques, for performing a variety of
diagnostic assays, and for screening for modulators of FLAP
endonuclease activities. These methods utilize polynucleotide
sequences encoding FLAP endonuclease proteins and
polynucleotides which are substantially identical to
naturally-occurring polynucleotide sequences (e.g., cDNA
or genomic gene) that encode such FLAP endonuclease
proteins.

In one aspect of the invention, FLAP endonuclease
polypeptides and compositions thereof are provided. In one
embodiment, FLAP endonuclease polypeptides comprise
polypeptide sequences which are substantially identical to a
sequence shown in FIG. 1 (panel A) or FIG. 5, designated
human FEN-1, or FIG. 2 (panel A), designated mouse
FEN-1, or a cognate gene sequence in another species,
typically mammalian, most usually rodent or primate. An
example of a FEN-1 polypeptide is the 380 amino acid long
polypeptide of SEQ ID NO: 1 or the 378 amino acid long
polypeptide of SEQ ID NO: 3. An example of a FEN-1
polynucleotide is a polynucleotide having the nucleotide
sequence of SEQ ID NO: 2 or SEQ ID NO: 4. Also provided
are yeast FEN-1 polypeptides having endonuclease activity,
such as the Saccharomyces FEN-1 polypeptide sequence
shown in FIG. 3 (panel A) as SEQ ID NO: 5 and
substantially identical polypeptides. Muteins, fragments, and
other structural variants, and polymorphic sequence alleles,
including naturally-occurring allelic variants, are also
encompassed in the invention. Preferably, the FLAP
endonuclease polypeptides and polynucleotides are isolated
and/or substantially pure, or replicated, integrated, or
expressed in a host other than human or mouse cells or
Saccharomyces cells. Preferably, FEN-1 polypeptides have a
detectable endonuclease activity, such as a 5' flap cleavage
activity, and typically substantially lack 3' flap cleavage
activity and/or single-strand cleavage activity.

Polynucleotide sequences encoding FEN-1 polypeptides
are provided. The characteristics of the cloned sequences are
given, including the nucleotide and predicted amino acid
sequences of mammalian FEN-1 in FIG. 1 (SEQ ID NOS: 1
and 2), FIG. 2 (SEQ ID NOS: 3 and 4), or FIG. 5, and the
nucleotide and predicted amino acid sequences of the yeast
cognate gene to FEN-1, in FIG. 3 (SEQ ID NOS: 5 and 6).
Polynucleotides comprising these sequences can serve as
templates for the recombinant expression of quantities of
FEN-1 polypeptides, such as human FEN-1 and mouse
FEN-1, and variants thereof, such as muteins and the like.
Polynucleotides comprising these sequences can also serve
as probes for nucleic acid hybridization to detect the
transcription rate and mRNA abundance of FEN-1 mRNA in
individual lymphocytes (or other cell types) by in situ
hybridization, and in specific cell populations by Northern
blot analysis and/or by in situ hybridization (Alwine et
al. (1977) Proc. Natl. Acad. Sci. U.S.A. 74: 5350) and/or PCR
amplification and/or LCR detection. Such recombinant
polypeptides and nucleic acid hybridization probes have
utility for in vitro diagnostic methods for identification of
genome instability and neoplasia or preneoplasia, for
diagnosis and treatment of pathological conditions and genetic
diseases linked to the FEN-1 locus, for forensic
identification of human individuals, and for gene therapy of
FEN-1 deficiency conditions and neoplasia, among other uses
apparent to those of skill in the art.

In addition to polynucleotides which are substantially
identical to all or a portion of a naturally-occurring
mammalian FEN-1 gene or mRNA, the invention provides
polynucleotides encoding a mammalian FEN-1 polypeptide.
Such polynucleotides are provided with reference to the
novel deduced polypeptide sequence information provided
in FIGS. 1 and 2. Polynucleotides encoding mammalian
FEN-1 polypeptides can be constructed by those skilled in
the art on the basis of the disclosed SEQ ID NO: 1 and SEQ
ID NO: 3 in view of the degeneracy of the genetic code. In
an embodiment, the FEN-1 polynucleotides encode a
full-length FEN-1 polypeptide of SEQ ID NO: 1 or SEQ ID NO:
3. In an embodiment, the FEN-1 polynucleotides encode a
mutein or analog of human or mouse FEN-1. FEN-1
polynucleotides can also encode fragments of a mammalian
FEN-1 polypeptide, and/or fusion proteins comprising a
full-length FEN-1 polypeptide or fragment or analog thereof
in polypeptide linkage to a heterologous polypeptide (e.g.,
epitope tag, β-galactosidase, immunoglobulin,
glutathione-S-transferase, non-FLAP nuclease, polymerase,
and the like).
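
The degeneracy of the genetic code means that a single polypeptide sequence can be encoded by many different polynucleotides. The following is a minimal sketch that counts how many distinct coding sequences exist for a given peptide, using the codon multiplicities of the standard genetic code. The function name and the example peptides are hypothetical illustrations; they are not sequences from the figures or the sequence listing.

```python
# Number of synonymous codons per amino acid (one-letter code) in the
# standard genetic code; the 20 values sum to the 61 sense codons.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(peptide: str) -> int:
    """Return the number of distinct nucleotide sequences (excluding any
    stop codon) that encode the given peptide under the standard code."""
    total = 1
    for residue in peptide.upper():
        if residue not in CODON_COUNTS:
            raise ValueError(f"unknown amino acid: {residue!r}")
        total *= CODON_COUNTS[residue]
    return total


# Hypothetical three-residue peptides used only to show the calculation.
print(count_encoding_sequences("MKW"))  # Met (1) x Lys (2) x Trp (1)
print(count_encoding_sequences("LSR"))  # Leu (6) x Ser (6) x Arg (6)
```

Because the count is a product over residues, it grows very rapidly with length, so a polypeptide of several hundred amino acids corresponds to an astronomically large family of encoding polynucleotides.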

The invention also provides methods for producing a
substantially purified FLAP endonuclease, such as a yeast or
mammalian FEN-1 protein. Such methods typically comprise
expressing in a host cell a heterologous polynucleotide
consisting of (1) a polynucleotide sequence encoding FEN-1
operably linked to (2) a heterologous transcription
regulatory region (e.g., promoter and enhancer) capable of
driving transcription of the linked FEN-1 sequence in the host
cell to produce an mRNA which can be translated in the host
cell into a FEN-1 polypeptide, preferably having FLAP
endonuclease activity. Typically, transcription control
sequences in the heterologous polynucleotide include
transcription termination sequences, polyadenylation
sequences, and the like. The heterologous polynucleotide will
also include sequences such that the transcribed RNA has a
suitable ribosome binding site and untranslated sequences to
ensure efficient translation in the host cell; as the host cell
may be selected to be prokaryotic (e.g., E. coli) or eukaryotic
(e.g., yeast, CHO cells, HeLa cells, etc.), the practitioner will
select compatible transcription and translation control
sequences appropriate for use in the selected host cell. In an
embodiment, the FEN-1 polypeptide expressed is in
polypeptide linkage to a signal sequence to effect
compartmentalization and/or secretion from the host cell.

The invention provides polynucleotides comprising a
FEN-1 encoding sequence operably linked to a heterologous
transcriptional regulatory sequence, such as for example a
prokaryotic promoter or a eukaryotic promoter, and optionally
an enhancer, which is not present adjacent to a
naturally-occurring FEN-1 gene. For example and not limitation,
suitable heterologous promoters include HSV tk promoter
and SV40 large T antigen promoter/enhancer, among others.

The invention also provides a method for producing
substantially pure FEN-1 polypeptide having detectable
endonuclease activity; the method comprises expressing a
FEN-1 polynucleotide in a host cell under transcriptional
control of a heterologous promoter whereby FEN-1
polypeptide is expressed and collected in substantially
purified form.

The invention also provides ΔRAD2 endonuclease
polypeptides and compositions thereof, wherein said
ΔRAD2 endonuclease polypeptides consist of a yeast RAD2
polypeptide substantially lacking a spacer (S) region,
wherein the carboxyl-terminal portion of the N region is in
polypeptide linkage with the I region with a truncated spacer
of 0 to 25 amino acids, preferably a spacer of 14 amino
acids, typically consisting of -QKRESAKS TARAR-
(SEQ ID NO: 13). For example, the Saccharomyces ΔRAD
polypeptide of FIG. 4 (panel A), having SEQ ID NO: 7, is
a suitable ΔRAD2 endonuclease polypeptide having
detectable 5' flap cleavage activity.

The invention also provides antisense polynucleotides
complementary to polynucleotides encoding FEN-1
polypeptide sequences, typically complementary to
polynucleotide sequences which are substantially identical to a
naturally-occurring mammalian FEN-1 gene sequence. Such
antisense polynucleotides are employed to inhibit
transcription and/or translation of the FEN-1 mRNA species and
thereby effect a reduction in the amount of the respective
FEN-1 polypeptide in a cell (e.g., a neoplastic cell of a
patient). Such antisense polynucleotides can function as
FEN-1-modulating agents by inhibiting the formation of
FEN-1 required for DNA replication and/or repair of DNA
damage (e.g., resulting from chemotherapy with a
DNA-damaging agent such as bleomycin, cisplatin, nitrogen
mustard, doxorubicin, ionizing radiation, and the like) or
maintenance of aneuploid genomes characteristic of
neoplastic variant cells. The antisense polynucleotides can
promote cell death in susceptible cells (e.g., cells requiring
FEN-1 activity for DNA repair or replication). The FEN-1
antisense polynucleotides are substantially identical to at
least 25 contiguous nucleotides of the complementary
sequence of the FEN-1 cDNA sequence shown in FIG. 1
(panel B) and denoted SEQ ID NO: 2 or FIG. 2 (panel B) and
denoted SEQ ID NO: 4. The FEN-1 antisense polynucleotides
are typically ssDNA, ssRNA, methylphosphonate
backbone nucleic acids, phosphorothiolate backbone, polyamide
nucleic acids, and the like antisense structures known
in the art. In one aspect of the invention, an antisense
polynucleotide is administered to inhibit transcription and/or
translation of FEN-1 in a cell.
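
The requirement that an antisense polynucleotide match at least 25 contiguous nucleotides of the complementary sequence can be illustrated computationally. The following is a minimal sketch, assuming a DNA sense strand written 5' to 3' with only the letters A, C, G, and T; the function names are hypothetical, and the 30-nucleotide input is an arbitrary placeholder, not the FEN-1 cDNA of SEQ ID NO: 2 or SEQ ID NO: 4.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_candidates(sense: str, length: int = 25) -> list:
    """Enumerate every antisense oligonucleotide of the given length that
    is complementary to a contiguous window of the sense strand."""
    sense = sense.upper()
    return [
        reverse_complement(sense[start:start + length])
        for start in range(len(sense) - length + 1)
    ]


# Arbitrary 30-nt placeholder sequence (NOT taken from the sequence listing).
placeholder_sense = "ATGGGAATTCAAGGCCTGGCCAAACTAATT"
for oligo in antisense_candidates(placeholder_sense):
    print(oligo)
```

A sense sequence of N nucleotides yields N - 24 overlapping 25-mer candidates; choosing among them (for example, by target accessibility or specificity) is outside the scope of this sketch.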

In one embodiment, candidate therapeutic agents are
identified by their ability to block the binding of a FEN-1
polypeptide to a DNA flap substrate and/or block the
endonucleolytic activity of a FEN-1 polypeptide to cleave a DNA
flap substrate. The FEN-1 polypeptide preferably is a
full-length mature FEN-1 protein. Typically, the FEN-1
polypeptide comprises an amino acid sequence identical to a
naturally-occurring mammalian FEN-1 protein sequence,
although mutant FEN-1 polypeptides are sometimes used if
the mutant FEN-1 polypeptide binds to and/or catalyzes
cleavage of a DNA flap substrate under control assay
conditions (e.g., physiological conditions). Agents are tested
for their ability to alter binding and/or cleavage of DNA flap
structures (or nicked DNA) by a FEN-1 polypeptide under
suitable assay binding conditions. One means for detecting
binding of a FEN-1 polypeptide to a DNA flap structure is
to immobilize the DNA flap structure, such as by covalent or
noncovalent chemical linkage to a solid support, often via a
spacer sequence, and to contact the immobilized flap
substrate with a FEN-1 polypeptide that has been labeled
with a detectable marker (e.g., by incorporation of
radiolabeled amino acid, by epitope tagging and reporting with a
fluorescent-labelled anti-epitope tag antibody, and the like).
Such contacting is typically performed in aqueous
conditions which permit binding of a DNA flap substrate to a
full-length human or mouse FEN-1 polypeptide. Binding of
the labeled FEN-1 polypeptide to the immobilized DNA flap
substrate is measured by determining the extent to which the
labeled FEN-1 polypeptide is immobilized as a result of a
specific binding interaction. Such specific binding may be
reversible, or may be optionally irreversible if a
cross-linking agent is added in appropriate experimental
conditions. Alternatively, the DNA flap substrate may be labelled
(e.g., by incorporation of a radiolabeled or biotinylated
nucleotide) and the FEN-1 polypeptide immobilized. In one
variation, the degree of enzymatic cleavage of the flap
substrate by the FEN-1 polypeptide is quantitated, such as
by release of labeled flap nucleotides from an immobilized
flap substrate. Agents that inhibit or augment the formation
of bound complexes (or flap cleavage activity) as compared
to a control binding reaction (or flap cleavage reaction)
lacking agent are thereby identified as FEN-1-modulating
agents and are candidate therapeutic agents.
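
The comparison against a control reaction lacking agent can be expressed as a simple calculation. The following is a minimal sketch, assuming the assay readout is a background-corrected signal (for example, counts of released labeled flap nucleotides) that is proportional to binding or cleavage. The function names and the 50% and 150% cutoffs are hypothetical choices for illustration; the text does not specify any thresholds.

```python
def percent_of_control(signal_with_agent: float, signal_control: float) -> float:
    """Express the signal measured with the test agent as a percentage of
    the signal from the control reaction lacking agent."""
    if signal_control <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * signal_with_agent / signal_control


def classify_agent(
    signal_with_agent: float,
    signal_control: float,
    lower_cutoff: float = 50.0,   # hypothetical threshold, percent of control
    upper_cutoff: float = 150.0,  # hypothetical threshold, percent of control
) -> str:
    """Label an agent as inhibiting, augmenting, or not affecting activity."""
    pct = percent_of_control(signal_with_agent, signal_control)
    if pct < lower_cutoff:
        return "inhibitor"
    if pct > upper_cutoff:
        return "enhancer"
    return "no effect"


# Made-up readouts (arbitrary units) used only to exercise the functions.
print(classify_agent(200.0, 1000.0))
print(classify_agent(2000.0, 1000.0))
print(classify_agent(1000.0, 1000.0))
```

In practice the cutoffs would be set from replicate controls and the assay's variability rather than fixed in advance.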

In a variation of the invention, polynucleotides of the
invention are employed for diagnosis of pathological
conditions or genetic disease that involve neoplasia, aging, or
other medical conditions related to FEN-1 function, and
more specifically conditions and diseases that involve
alterations in the structure or abundance of a FEN-1, or which
are linked to a pathognomonic FEN-1 allele which can be
detected by RFLP and/or allele-specific PCR.

The invention also provides antibodies which bind to
FEN-1 with an affinity of about at least 1×10⁷ M⁻¹ and
which lack specific high affinity binding for other mammalian
proteins (e.g., albumin, DNA polymerase α). Such
antibodies can be used as diagnostic reagents to identify
cells exhibiting altered FEN-1 abundance or structure (e.g.,
preneoplastic or neoplastic cells) in a cellular sample from
a patient (e.g., a lymphocyte sample, a solid tissue biopsy)
as being cells which contain an increased amount of FEN-1
polypeptide as compared to non-neoplastic cells of the same
cell type(s). Frequently, anti-FEN-1 antibodies are included
as diagnostic reagents for immunohistopathology staining of
cellular samples in situ. Additionally, anti-FEN-1 antibodies
may be used therapeutically by targeted delivery to
neoplastic cells (e.g., by cationization or by liposome/
immunoliposome delivery).

The invention also provides FEN-1 polynucleotide probes
for diagnosis of disease states (e.g., neoplasia or
preneoplasia) by detection of a FEN-1 mRNA or rearrangements
or amplification of the FEN-1 gene in cells explanted
from a patient, or detection of a pathognomonic FEN-1
allele (e.g., by RFLP or allele-specific PCR analysis).
Typically, the detection will be by in situ hybridization using
a labeled (e.g., ³²P, ³⁵S, ¹⁴C, ³H, fluorescent, biotinylated,
digoxigeninylated) FEN-1 polynucleotide, although Northern
blotting, dot blotting, or solution hybridization on bulk
RNA or poly A+ RNA isolated from a cell sample may be
used, as may PCR amplification using FEN-1-specific primers.
Cells which contain an altered amount (typically a
significant increase) of FEN-1 mRNA as compared to
non-neoplastic cells of the same cell type(s) will be identified as
candidate diseased cells. Similarly, the detection of
pathognomonic rearrangements or amplification of the FEN-1 gene
locus or closely linked loci in a cell sample will identify the
presence of a pathological condition or a predisposition to
developing a pathological condition (e.g., cancer, genetic
disease). The polynucleotide probes are also used for
forensic identification of individuals, such as for paternity
testing or identification of criminal suspects or unknown
decedents.

The present invention also provides a method for diag- 
nosing a disease (e.g., neoplasia) in a human patient, 
wherein a diagnostic assay (e.g., immunohistochemical
staining of fixed cells by an antibody that specifically binds 
human FEN-1) is used to determine if a predetermined 
pathognomonic concentration of FEN-1 polypeptide or its 
encoding mRNA is present in a biological sample from a 
human patient; if the assay indicates the presence of FEN-1
polypeptide or its encoding mRNA outside of the normal 
range (e.g., beyond the predetermined pathognomonic 
concentration), the patient is diagnosed as having a disease 
condition or predisposition. 

The invention also provides therapeutic agents which
inhibit neoplasia or apoptosis by modulating FEN-1 func-
tion by inhibiting or augmenting formation of enzymatically
active FEN-1; such agents can be used as pharmaceuticals.
Such pharmaceuticals will be used to treat a variety of
human and veterinary diseases, such as: neoplasia,
hyperplasia, neurodegenerative diseases, aging, AIDS, fun-
gal infection, and the like. In an embodiment, the agent
consists of a gene therapy vector encoding an enzymatically
active FEN-1 polypeptide, or alternatively an enzymatically
inactive FEN-1 polypeptide which can competitively inhibit
endogenous FEN-1 function.

The invention also provides methods for identifying
polypeptide sequences which bind to a FEN-1 polypeptide.
For example, a yeast two-hybrid screening system can be
used for identifying polypeptide sequences that bind to
FEN-1. Yeast two-hybrid systems wherein one GAL4 fusion
protein comprises a FEN-1 polypeptide sequence, typically
a full-length or near full-length FEN-1 polypeptide sequence
(e.g., a polypeptide sequence of FIGS. 1 or 2), and the other
GAL4 fusion protein comprises a cDNA library member can
be used to identify cDNAs encoding proteins which interact
with the FEN-1 polypeptide, and can be screened according to
the general method of Chien et al. (1991) op.cit.
Alternatively, an E. coli/BCCP interactive screening system
(Germino et al. (1993) Proc. Natl. Acad. Sci. (U.S.A.) 90:
933; Guarente L (1993) Proc. Natl. Acad. Sci. (U.S.A.) 90:
1639, incorporated herein by reference) can be used to
identify interacting protein sequences. Also, an expression
library, such as a λgt11 cDNA expression library, can be
screened with a labelled FEN-1 polypeptide to identify
cDNAs encoding polypeptides which specifically bind to the
FEN-1 polypeptide. For these procedures, cDNA libraries
usually comprise mammalian cDNA populations, typically
human, mouse, or rat, and may represent cDNA produced
from RNA of one cell type, tissue, or organ and one or more
developmental stages. Specific binding for screening cDNA
expression libraries is usually provided by including one or
more blocking agents (e.g., albumin, nonfat dry milk solids,
etc.) prior to and/or concomitant with contacting the labeled
FEN-1 polypeptide (and/or labeled anti-FEN-1 polypeptide
antibody).

In one aspect, the invention provides non-human animals
(e.g., mice) which comprise a homozygous pair of function-
ally disrupted endogenous FEN-1 alleles. Such functionally
disrupted endogenous FEN-1 alleles typically result from
homologous gene targeting, and often comprise a naturally-
occurring FEN-1 allele which is (1) disrupted by deletion of
an essential structural sequence (e.g., exon) or regulatory
sequence (e.g., promoter, enhancer, polyadenylation site,
splice junction site) or (2) disrupted by integration of an
exogenous polynucleotide sequence (e.g., neoR gene) into an
essential structural sequence (e.g., exon) or regulatory
sequence (e.g., promoter, enhancer, polyadenylation site,
splice junction site). Such FEN-1 knockout animals can be
sold commercially as test animals (e.g., as a preneoplastic
animal for testing genotoxic and/or carcinogenic agents,
such as a p53 knockout mouse or the Harvard
OncoMouse™), bred to transfer the disrupted FEN-1 allele(s)
into other genetic backgrounds, and sold as disease
models for screening for therapeutic agents, for developing
immunodeficient mice substantially lacking the capacity to
undergo immunoglobulin VDJ rearrangement and/or isotype
switching, recombination-deficient mice, and the like. Such
knockout animals have a wide variety of utilities in addition
to being diagnostic reagents to quantify genotoxicity of a
compound or serve as radiosensitive animals, including
serving as pets and sources of animal protein (e.g., as a
foodstuff), among many other practical presently available
uses.

In one aspect of the invention, transgenic nonhuman
animals, such as mice, bearing a transgene encoding a
FEN-1 polypeptide are provided. Such transgenes may be
homologously recombined into the host chromosome or
may be non-homologously integrated. Such transgenes can
often be present in a mouse lacking functional endogenous
mouse FEN-1 (i.e., a FEN-1 knockout background), and
typically such transgenes are human and can comprise the
human FEN-1 gene, a human FEN-1 cDNA under transcrip-
tional control of a mouse FEN-1 transcriptional regulatory
region (e.g., at least 3-5 kb of 5' flanking sequence upstream
of the mouse FEN-1 transcription start site).

In an embodiment, the invention provides FEN-1 poly- 
nucleotides for gene therapy and compositions of such 
FEN-1 gene therapy vectors for treating or preventing 
disease. 

A further embodiment involves a polynucleotide (e.g., a
DNA isolate) consisting essentially of a genomic DNA 
sequence encoding FEN-1 and more particularly a compo- 
sition consisting of cDNA molecules which encode the 
human FEN-1 protein. 

A further embodiment involves a polynucleotide (e.g., a
DNA isolate) consisting essentially of a genomic DNA
sequence encoding human FEN-1 and more particularly a
composition consisting of cDNA molecules which encode
the FEN-1 protein. 

The invention also provides a novel diagnostic assay,
comprising contacting a sample believed to potentially con-
tain a predetermined target polynucleotide sequence (e.g., a
target polynucleotide; analyte) with a probe polynucleotide
capable of specific hybridization to all or a portion of said
target polynucleotide under assay conditions, and forming as
a result of the hybridization a 5' flap structure which can be
cleaved by FEN-1 releasing nucleotides (or polynucleotides)
in the flap strand; incubating with FEN-1 and detecting the
release of nucleotides (or polynucleotides) of the flap strand,
the release of nucleotides (or polynucleotides) thereby
reporting the formation of a flap structure (or nicked DNA)
and thereby reporting the presence, and optionally quantity,
of the predetermined target polynucleotide sequence in the
sample. Typically, the probe polynucleotide comprises two
portions, a first portion which hybridizes to the target
sequence and a second portion which is adjacent to said first
portion and which forms the flap strand; frequently an
adjacent polynucleotide is present which hybridizes to the
portion of the target polynucleotide immediately 5' to the
portion of the target which hybridizes to the probe poly-
nucleotide sequence. The portion of the probe polynucle-
otide which forms the flap is typically labelled, and the
entire probe may be labelled; in some embodiments, the
target polynucleotide is 5'-end-labeled. Often, the probe
polynucleotide is immobilized. The release of label in the
presence of FEN-1 measures the abundance of target poly-
nucleotide in the sample. For illustration and not limitation,
with reference to FIG. 6, a probe may correspond to the flap
strand, a target polynucleotide may correspond to the bridge
strand (F_br), and an adjacent strand may correspond to the
F_adj strand, as shown. Alternatively, a probe polynucleotide,
typically labelled, may be immobilized via its 5' end such
that hybridization to the target polynucleotide will form a
cleavable flap which can be cleaved by FEN-1 releasing the
cleaved portion of the probe polynucleotide which is hybrid-
ized to the target; quantitating the amount of released label
thereby detects the amount of target polynucleotide in the
sample. The formation of the flap structure can also result
from mismatch between the probe and the target; thus probes
which are mismatched to the target can form flaps which are
cleaved by FEN-1, thereby serving to detect mismatches
such as those indicating point mutants, small deletion or
addition mutants, small inversion mutants, and the like, so
long as the hybridization of the probe to the target generates
a cleavable flap due to the sequence mismatch.

A further understanding of the nature and advantages of 
the invention will become apparent by reference to the
remaining portions of the specification and drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. Sequence of human FEN-1. Panel A: predicted 
amino acid sequence of human FEN-1. Panel B: Nucleotide 
sequence of coding portion of human FEN-1 cDNA. 

FIG. 2. Sequence of mouse FEN-1. Panel A: predicted 
amino acid sequence of mouse FEN-1. Panel B: Nucleotide 
sequence of coding portion of mouse FEN-1 cDNA.

FIG. 3. Sequence of Saccharomyces FEN-1. Panel A:
predicted amino acid sequence of yeast FEN-1. Panel B: 
Nucleotide sequence of coding portion of yeast FEN-1 
cDNA. 

FIG. 4. Sequence of Saccharomyces ARAD2. Panel A:
amino acid sequence of yeast ARAD2. Panel B: Nucleotide
sequence of coding portion of yeast ARAD2 DNA.

FIG. 5. Sequence and deduced amino acid sequence of
complete human FEN-1 cDNA, including untranslated por- 
tion. 

FIG. 6. DNA substrate for the flap cleavage. Nucleotide
sequence for each of the three oligonucleotides which make
up Flap Substrate 1 is shown. The flap strand (HJ42) was 5'
end-labeled and annealed to the F_br (HJ41) and F_adj (HJ43)
strands as described. The solid lines above and below this
structure are shown to illustrate continuous strands. Follow-
ing incubation of Flap Substrate 1 with protein extract, the
reaction products are separated from the 34 nt input on a
denaturing polyacrylamide gel. If cutting occurs at the
elbow, for example, a 20 nt labeled fragment would be
observed. Cutting proximal (hatched arrow) or distal (solid
arrow) to the elbow would result in longer or shorter labeled
products, respectively.

FIG. 7. Identification of flap endonucleolytic cleavage
activity in mouse lymphocytes and fibroblasts. Flap Sub-
strate (FIG. 6) was incubated with varying amounts of
nuclear extract in the presence of 0.5 μg sonicated salmon
sperm DNA under standard FEN-1 endonuclease assay
conditions as described. Lane 1, no extract; lanes 2 and 3, 40
ng and 10 ng 1-8 pre-B nuclear extract, respectively; lanes
4-6, 160 ng, 40 ng and 10 ng NIH3T3 fibroblast nuclear
extract, respectively. Reaction products were separated on a
10% denaturing polyacrylamide gel and visualized by auto-
radiography. The numbers on the left indicate the position
and size of oligonucleotide standards.

DESCRIPTION OF THE PREFERRED
EMBODIMENTS 

Unless defined otherwise, all technical and scientific
terms used herein have the same meaning as commonly
understood by one of ordinary skill in the art to which this
invention belongs. Although any methods and materials
similar or equivalent to those described herein can be used
in the practice or testing of the present invention, the
preferred methods and materials are described. For purposes
of the present invention, the following terms are defined
below.

Definitions 

As used herein, the twenty conventional amino acids and
their abbreviations follow conventional usage
(Immunology - A Synthesis, 2nd Edition, E. S. Golub and D.
R. Gren, Eds., Sinauer Associates, Sunderland, Mass.
(1991), which is incorporated herein by reference). Stereoi-
somers (e.g., D-amino acids) of the twenty conventional
amino acids, unnatural amino acids such as α,α-
disubstituted amino acids, N-alkyl amino acids, lactic acid,
and other unconventional amino acids may also be suitable
components for polypeptides of the present invention.
Examples of unconventional amino acids include:
4-hydroxyproline, γ-carboxyglutamate, ε-N,N,N-
trimethyllysine, ε-N-acetyllysine, O-phosphoserine,
N-acetylserine, N-formylmethionine, 3-methylhistidine,
5-hydroxylysine, ω-N-methylarginine, and other similar
amino acids and imino acids (e.g., 4-hydroxyproline). In the
polypeptide notation used herein, the lefthand direction is
the amino terminal direction and the righthand direction is
the carboxy-terminal direction, in accordance with standard
usage and convention. Similarly, unless specified otherwise,
the lefthand end of single-stranded polynucleotide
sequences is the 5' end; the lefthand direction of double-
stranded polynucleotide sequences is referred to as the 5'
direction. The direction of 5' to 3' addition of nascent RNA
transcripts is referred to as the transcription direction;
sequence regions on the DNA strand having the same
sequence as the RNA and which are 5' to the 5' end of the
RNA transcript are referred to as "upstream sequences";
sequence regions on the DNA strand having the same
sequence as the RNA and which are 3' to the 3' end of the
coding RNA transcript are referred to as "downstream
sequences".

The term "naturally-occurring" as used herein as applied
to an object refers to the fact that an object can be found in
nature. For example, a polypeptide or polynucleotide
sequence that is present in an organism (including viruses)
that can be isolated from a source in nature and which has
not been intentionally modified by man in the laboratory is
naturally-occurring. Generally, the term naturally-occurring
refers to an object as present in a non-pathological 
(undiseased) individual, such as would be typical for the 
species. 

As used herein, the term "FEN-1" generally refers to the
mammalian FEN-1 gene and mammalian FEN-1 proteins,
including isoforms thereof, unless otherwise identified;
human and murine FEN-1 proteins and genes are preferred
exemplifications of mammalian FEN-1, and in its narrowest
usage FEN-1 refers to a FEN-1 polynucleotide and polypep-
tide sequences having substantial identity to SEQ ID NO: 2
or 4, or is at least 85 percent substantially identical to SEQ
ID NO: 2 or 4, or is at least 89-95 percent substantially
identical to SEQ ID NO: 2 or 4. The term FEN-1 can also
refer to the yeast FEN-1 of FIG. 3, as well as to fragments
and muteins.

The term "corresponds to" is used herein to mean that a 
polynucleotide sequence is homologous (i.e., is identical, 
not strictly evolutionarily related) to all or a portion of a 
reference polynucleotide sequence, or that a polypeptide 
sequence is identical to a reference polypeptide sequence. In 
contradistinction, the term "complementary to" is used 
herein to mean that the complementary sequence is homolo- 
gous to all or a portion of a reference polynucleotide 
sequence. For illustration, the nucleotide sequence
"TATAC" corresponds to a reference sequence "TATAC"
and is complementary to a reference sequence "GTATA". 
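
The distinction drawn above can be checked mechanically. The following is a minimal Python sketch, assuming that sequences are DNA written 5' to 3' as plain strings containing only A, T, G and C; the function names are hypothetical and are used only for illustration.

```python
# Illustrative helpers; the names are hypothetical and not part of the
# specification. Sequences are assumed to be DNA written 5' to 3'.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/T/G/C only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def corresponds_to(seq: str, reference: str) -> bool:
    """True if seq is identical to all or a portion of the reference."""
    return seq.upper() in reference.upper()


def complementary_to(seq: str, reference: str) -> bool:
    """True if the complement of seq is identical to all or a portion
    of the reference."""
    return reverse_complement(seq) in reference.upper()


print(corresponds_to("TATAC", "TATAC"))    # True
print(complementary_to("TATAC", "GTATA"))  # True
```

Both calls reproduce the "TATAC" illustration given in the text.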

The following terms are used to describe the sequence
relationships between two or more polynucleotides: "refer-
ence sequence", "comparison window", "sequence
identity", "percentage of sequence identity", and "substan-
tial identity". A "reference sequence" is a defined sequence
used as a basis for a sequence comparison; a reference
sequence may be a subset of a larger sequence, for example,
as a segment of a full-length cDNA or gene sequence given
in a sequence listing, such as a polynucleotide sequence of
FIG. 1, FIG. 2, FIG. 3, FIG. 4, or FIG. 5 or may comprise
a complete cDNA or gene sequence. Generally, a reference
sequence is at least 20 nucleotides in length, frequently at
least 25 nucleotides in length, and often at least 50 nucle-
otides in length. Since two polynucleotides may each (1)
comprise a sequence (i.e., a portion of the complete poly-
nucleotide sequence) that is similar between the two
polynucleotides, and (2) may further comprise a sequence
that is divergent between the two polynucleotides, sequence
comparisons between two (or more) polynucleotides are 
typically performed by comparing sequences of the two 
polynucleotides over a "comparison window" to identify 
and compare local regions of sequence similarity. 

A "comparison window", as used herein, refers to a
conceptual segment of at least 25 contiguous nucleotide
positions wherein a polynucleotide sequence may be com-
pared to a reference sequence of at least 25 contiguous
nucleotides and wherein the portion of the polynucleotide
sequence in the comparison window may comprise additions
or deletions (i.e., gaps) of 20 percent or less as compared to
the reference sequence (which does not comprise additions
or deletions) for optimal alignment of the two sequences.
Optimal alignment of sequences for aligning a comparison
window may be conducted by the local homology algorithm
of Smith and Waterman (1981) Adv. Appl. Math. 2: 482, by
the homology alignment algorithm of Needleman and Wun-
sch (1970) J. Mol. Biol. 48: 443, by the search for similarity
method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci.
(U.S.A.) 85: 2444, by computerized implementations of
these algorithms (GAP, BESTFIT, FASTA, and TFASTA in
the Wisconsin Genetics Software Package Release 7.0,
Genetics Computer Group, 575 Science Dr., Madison, Wis.),
or by inspection, and the best alignment (i.e., resulting in the
highest percentage of homology over the comparison
window) generated by the various methods is selected.

The term "sequence identity" means that two polynucle-
otide sequences are identical (i.e., on a nucleotide-by-
nucleotide basis) over the window of comparison. The term
"percentage of sequence identity" is calculated by compar-
ing two optimally aligned sequences over the window of
comparison, determining the number of positions at which
the identical nucleic acid base (e.g., A, T, C, G, U, or I)
occurs in both sequences to yield the number of matched
positions, dividing the number of matched positions by the
total number of positions in the window of comparison (i.e.,
the window size), and multiplying the result by 100 to yield
the percentage of sequence identity. The term "substantial
identity" as used herein denotes a characteristic of a poly-
nucleotide sequence, wherein the polynucleotide comprises
a sequence that has at least 80 percent sequence identity,
preferably at least 85 percent identity and often 89 to 95
percent sequence identity, more usually at least 99 percent
sequence identity as compared to a reference sequence over
a comparison window of at least 20 nucleotide positions,
frequently over a window of at least 30-50 nucleotides,
wherein the percentage of sequence identity is calculated by
comparing the reference sequence to the polynucleotide
sequence which may include deletions or additions which
total 20 percent or less of the reference sequence over the
window of comparison. The reference sequence may be a
subset of a larger sequence, for example, as a segment of the
full-length FEN-1 polynucleotide sequence shown in FIG. 1,
FIG. 2, or FIG. 5, or a segment of a FEN-1 protein.
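
The percentage calculation recited above (matched positions divided by window size, multiplied by 100) can be sketched in Python as follows. This is only an illustration under stated assumptions: the two sequences are taken to be already optimally aligned and of equal length, gaps are written as "-", and the function names, default thresholds, and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over a comparison window.

    Assumes both strings are already optimally aligned, have equal
    length, and use '-' for gap positions (assumptions of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window must not be empty")
    # Count positions where the identical base occurs in both sequences.
    matched = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    # Divide by the window size and multiply by 100.
    return 100.0 * matched / len(aligned_a)


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               threshold: float = 80.0,
                               min_window: int = 20) -> bool:
    """Apply the 'at least 80 percent over at least 20 positions' test."""
    if len(aligned_a) < min_window:
        return False
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 20-nucleotide window with two mismatches at the 3' end.
reference = "ATGGGAATTCAAGGCCTGGC"
candidate = "ATGGGAATTCAAGGCCTGAA"
print(percent_identity(reference, candidate))            # 90.0
print(is_substantially_identical(reference, candidate))  # True
```

The threshold parameter can be raised to 85, 89, 95, or 99 to model the narrower ranges recited in the text.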

As applied to polypeptides, the term "substantial identity"
means that two peptide sequences, when optimally aligned,
such as by the programs GAP or BESTFIT using default gap
weights, share at least 80 percent sequence identity, prefer-
ably at least 89 percent sequence identity, more preferably at
least 95 percent sequence identity or more (e.g., 99 percent
sequence identity). Preferably, residue positions which are
not identical differ by conservative amino acid substitutions.
Conservative amino acid substitutions refer to the inter-
changeability of residues having similar side chains. For
example, a group of amino acids having aliphatic side chains
is glycine, alanine, valine, leucine, and isoleucine; a group
of amino acids having aliphatic-hydroxyl side chains is
serine and threonine; a group of amino acids having amide-
containing side chains is asparagine and glutamine; a group
of amino acids having aromatic side chains is phenylalanine,
tyrosine, and tryptophan; a group of amino acids having
basic side chains is lysine, arginine, and histidine; and a
group of amino acids having sulfur-containing side chains is
cysteine and methionine. Preferred conservative amino acid
substitution groups are: valine-leucine-isoleucine,
phenylalanine-tyrosine, lysine-arginine, alanine-valine, and
asparagine-glutamine.
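
The side-chain groups listed above lend themselves to a simple lookup. The following Python sketch encodes those groups with one-letter amino acid codes and tests whether a substitution stays within a group; the data structure and function names are hypothetical, and only the groups named in the text are included.

```python
# Side-chain groups as listed in the text, in one-letter amino acid codes.
CONSERVATIVE_GROUPS = {
    "aliphatic": set("GAVLI"),
    "aliphatic-hydroxyl": set("ST"),
    "amide-containing": set("NQ"),
    "aromatic": set("FYW"),
    "basic": set("KRH"),
    "sulfur-containing": set("CM"),
}

# Preferred substitution groups as listed in the text.
PREFERRED_GROUPS = [set("VLI"), set("FY"), set("KR"), set("AV"), set("NQ")]


def is_conservative_substitution(res_a: str, res_b: str, groups=None) -> bool:
    """True if two different residues belong to the same group."""
    if groups is None:
        groups = list(CONSERVATIVE_GROUPS.values())
    a, b = res_a.upper(), res_b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in groups)


print(is_conservative_substitution("L", "I"))                    # True
print(is_conservative_substitution("K", "H"))                    # True
print(is_conservative_substitution("K", "H", PREFERRED_GROUPS))  # False
print(is_conservative_substitution("D", "E"))                    # False
```

The last call returns False only because the text does not list an acidic group; it is not a statement about aspartate and glutamate in general.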

The terms "FEN-1 native protein" and "full-length FEN-1
protein" as used herein refer to a full-length FEN-1
polypeptide of 380 amino acids length consisting of SEQ ID
NO: 1 or the 378 amino acid long polypeptide of SEQ ID
NO: 3 or as naturally occurs in a mammalian species (e.g.,
mouse, human, simian, rat, etc.). A preferred FEN-1 native
protein is the polypeptide corresponding to the deduced
amino acid sequence shown in FIG. 1 (panel A) or FIG. 2
(panel A) or corresponding to the deduced amino acid
sequence of a cognate full-length FEN-1 cDNA of another
species. Also for example, a native FEN-1 protein present in
naturally-occurring somatic cells which express the FEN-1
gene is considered a full-length FEN-1 protein.

The term "fragment" as used herein refers to a polypep-
tide that has an amino-terminal and/or carboxy-terminal
deletion, but where the remaining amino acid sequence is
identical to the corresponding positions in the sequence
deduced from a full-length encoding cDNA sequence (e.g.,
the cDNA sequence shown in FIGS. 1, 2, 3, or 5). Fragments
typically are at least 14 amino acids long, preferably at least
20 amino acids long, usually at least 50 amino acids long or
longer, up to the length of a full-length naturally-occurring
FEN-1 polypeptide (e.g., about 378-380 amino acids).

The term "analog", "mutein" or "mutant" as used herein
refers to polypeptides which are comprised of a segment of
at least 10 amino acids that has substantial identity to a
portion of the naturally occurring protein. For example, a
FEN-1 analog comprises a segment of at least 10 amino
acids that has substantial identity to a FEN-1 protein, such
as the FEN-1 protein of FIG. 1 (panel A) or FIG. 2 (panel A);
preferably a deduced amino acid sequence of a mammalian
FEN-1 cDNA. In an embodiment, a FEN-1 analog or mutein
has at least one of the following properties: binding to a 5'
DNA flap substrate and/or cleaving a 5' DNA flap substrate
under suitable binding conditions. Typically, analog
polypeptides comprise a conservative amino acid substitu-
tion (or addition or deletion) with respect to the naturally-
occurring sequence. Analogs typically are at least 20 amino
acids long, preferably at least 50 amino acids long or longer,
most usually being as long as full-length naturally-occurring
protein (e.g., 378-380 amino acid residues for mouse and
human FEN-1). Some analogs may lack biological activity
(e.g., DNA 5' flap substrate binding or cleavage) but may
still be employed for various uses, such as for raising
antibodies to FEN-1 epitopes, as an immunological reagent
to detect and/or purify α-FEN-1 antibodies by affinity
chromatography, or as a competitive or noncompetitive
agonist, antagonist, or partial agonist of native FEN-1 pro-
tein function.

The term "FEN-1 polypeptide" is used herein as a generic
term to refer to native protein, fragments, or analogs of
FEN-1, or such fused to a second polypeptide sequence
(e.g., an epitope tag, β-gal, or other fusion). Hence, native
FEN-1, fragments of FEN-1, and analogs of FEN-1, as well
as FEN-1 fusion proteins are species of the FEN-1 polypep-
tide genus. Preferred FEN-1 polypeptides include: a murine
full-length FEN-1 protein comprising the murine polypep-
tide sequence shown in FIG. 2 (panel A), a full-length
human FEN-1 protein comprising a polypeptide sequence
encoded by a human FEN-1 cDNA of FIG. 1 (panel A),
polypeptides consisting essentially of the sequence of FEN-
1, and the naturally-occurring mouse and human FEN-1
isoforms, including post-translationally modified isoforms.
Generally, FEN-1 polypeptides are less than 5,000 amino
acids long, usually less than 1000 amino acids long, often
380 amino acids long or less.

The term "FEN-1 polynucleotide" as used herein refers to
a polynucleotide of at least 20 nucleotides wherein the
polynucleotide comprises a segment of at least 20 nucle-
otides which: (1) are at least 85 percent identical to a
naturally-occurring FEN-1 mRNA sequence or its comple-
ment or to a naturally-occurring FEN-1 genomic structural
gene sequence, and/or (2) encode a FEN-1 polypeptide. Due
to the degeneracy of the genetic code, some FEN-1 poly-
nucleotides encoding a FEN-1 polypeptide will be less than
85 percent identical to a naturally-occurring FEN-1 poly-
nucleotide. Similarly, some FEN-1 polynucleotides which
are suitable as hybridization probes, PCR primers, LCR
amplimers, and the like will not encode a FEN-1 polypep-
tide.
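
The effect of codon degeneracy noted above can be shown with a toy example. The Python sketch below uses two hypothetical 9-nucleotide coding sequences (shorter than the 20-nucleotide segments the definition requires, and not taken from any FEN-1 sequence) that encode the same tripeptide yet share few identical positions. The codon table is deliberately partial and holds only the standard-code codons used here.

```python
# Partial standard-genetic-code table: only the codons used in this example.
CODON_TABLE = {
    "CTG": "L", "TTA": "L",  # leucine
    "TCT": "S", "AGC": "S",  # serine
    "CGT": "R", "AGA": "R",  # arginine
}


def translate(cds: str) -> str:
    """Translate a coding sequence whose codons are all in CODON_TABLE."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def ungapped_percent_identity(a: str, b: str) -> float:
    """Percent identity of two equal-length sequences, position by position."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matched = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matched / len(a)


# Hypothetical sequences chosen for illustration only.
seq_a = "CTGTCTCGT"
seq_b = "TTAAGCAGA"

print(translate(seq_a), translate(seq_b))                 # LSR LSR
print(round(ungapped_percent_identity(seq_a, seq_b), 1))  # 22.2
```

Both sequences encode the same peptide while falling far below the 85 percent identity figure, which is why the definition treats encoding a FEN-1 polypeptide as an alternative criterion.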

The term "cognate" as used herein refers to a gene
sequence that is evolutionarily and functionally related
between species. For example but not limitation, in the
human genome, the human CD4 gene is the cognate gene to
the mouse CD4 gene, since the sequences and structures of
these two genes indicate that they are highly homologous
and both genes encode a protein which functions in signal-
ing T cell activation through MHC class II-restricted antigen
recognition. Thus, the cognate human gene to the murine
FEN-1 gene is the human gene which encodes an expressed
protein which has the greatest degree of sequence identity to
the murine FEN-1 protein and which exhibits an expression
pattern similar to that of the murine FEN-1 (e.g., expressed
in an equivalent tissue-specific expression pattern). Pre-
ferred cognate FEN-1 genes are: rat FEN-1, rabbit FEN-1,
canine FEN-1, nonhuman primate FEN-1, porcine FEN-1,
bovine FEN-1, and hamster FEN-1. Cognate genes to FEN-1
in non-mammalian species (e.g., C. elegans, avians, fish)
can also be isolated.

The term "agent" is used herein to denote a chemical
compound, a mixture of chemical compounds, an array of
spatially localized compounds (e.g., a VLSIPS peptide array,
polynucleotide array, and/or combinatorial small molecule
array), a biological macromolecule, a bacteriophage peptide
display library, a bacteriophage antibody (e.g., scFv) display
library, a polysome peptide display library, or an extract
made from biological materials such as bacteria, plants,
fungi, or animal (particularly mammalian) cells or tissues.
Agents are evaluated for potential activity as antineoplastics,
anti-inflammatories, or apoptosis modulators by inclusion in
screening assays described hereinbelow. Agents are evalu-
ated for potential activity as specific protein interaction
inhibitors (i.e., an agent which selectively inhibits a binding
interaction between two predetermined polypeptides but
which does not substantially interfere with cell viability) by
inclusion in screening assays described hereinbelow.

The term "antineoplastic agent" is used herein to refer to
agents that have the functional property of inhibiting a
development or progression of a neoplasm in a human, 
particularly a lymphocytic leukemia, lymphoma or pre- 
leukemic condition. 

The term "FEN-1 antagonist" is used herein to refer to
agents which inhibit FEN-1 activity and can produce a cell
phenotype characteristic of cells having reduced or unde-
tectable expression of FEN-1; FEN-1 antagonists typically
will enhance cell death, especially in the presence of DNA-
damaging agents. In contradistinction, FEN-1 agonists will
enhance FEN-1 activity.

As used herein, the terms "label" or "labeled" refer to
incorporation of a detectable marker, e.g., by incorporation
of a radiolabeled amino acid or attachment to a polypeptide
of biotinyl moieties that can be detected by marked avidin
(e.g., streptavidin containing a fluorescent marker or enzy-
matic activity that can be detected by optical or colorimetric
methods). Various methods of labeling polypeptides and
glycoproteins are known in the art and may be used.
Examples of labels for polypeptides include, but are not
limited to, the following: radioisotopes (e.g., 3H, 14C, 35S,
125I, 131I), fluorescent labels (e.g., FITC, rhodamine, lan-
thanide phosphors), enzymatic labels (e.g., horseradish
peroxidase, β-galactosidase, luciferase, alkaline
phosphatase), biotinyl groups, predetermined polypeptide
epitopes recognized by a secondary reporter (e.g., leucine
zipper pair sequences, binding sites for secondary
antibodies, transcriptional activator polypeptide, metal bind-
ing domains, epitope tags). In some embodiments, labels are
attached by spacer arms of various lengths to reduce poten-
tial steric hindrance.

As used herein, "substantially pure" means an object 
species is the predominant species present (i.e., on a molar 
basis it is more abundant than any other individual macro- 
molecular species in the composition), and preferably a 
substantially purified fraction is a composition wherein the 
object species comprises at least about 50 percent (on a 
molar basis) of all macromolecular species present. 
Generally, a substantially pure composition will comprise 
more than about 80 to 90 percent of all macromolecular 
species present in the composition. Most preferably, the 
object species is purified to essential homogeneity 
(contaminant species cannot be detected in the composition 
by conventional detection methods) wherein the composi- 
tion consists essentially of a single macromolecular species. 
Solvent species, small molecules (<500 Daltons), and 
elemental ion species are not considered macromolecular 
species. 
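
The molar-basis criteria recited above can be expressed as a short calculation. The following Python sketch is illustrative only: the function name, the returned keys, and the example quantities are hypothetical, and the input is assumed to list macromolecular species only (solvent, small molecules, and elemental ions already excluded, as the definition requires).

```python
def purity_summary(moles_by_species: dict, target: str) -> dict:
    """Summarize purity of a target species on a molar basis.

    moles_by_species maps each macromolecular species to its molar
    amount; non-macromolecular species are assumed to be excluded.
    """
    total = sum(moles_by_species.values())
    if total <= 0:
        raise ValueError("total molar amount must be positive")
    target_moles = moles_by_species[target]
    molar_percent = 100.0 * target_moles / total
    # Predominant: more abundant than any other individual species.
    predominant = all(
        target_moles > amount
        for name, amount in moles_by_species.items()
        if name != target
    )
    return {
        "molar_percent": molar_percent,
        "predominant": predominant,
        "at_least_50_percent": molar_percent >= 50.0,
        "more_than_80_percent": molar_percent > 80.0,
    }


# Hypothetical preparation, amounts in arbitrary molar units.
sample = {"FEN-1": 9.0, "contaminant A": 0.6, "contaminant B": 0.4}
print(purity_summary(sample, "FEN-1"))
```

A species can be predominant without reaching 50 percent (for example 40 percent alongside three contaminants at 20 percent each), which is why the two flags are reported separately.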

An "isolated" polynucleotide or polypeptide is a poly- 
nucleotide or polypeptide which is substantially separated 
from other contaminants that naturally accompany it, e.g., 
protein, lipids, and other polynucleotide sequences. The 
term embraces polynucleotide sequences which have been 
removed or purified from their naturally-occurring environ- 
ment or clone library, and include recombinant or cloned 
DNA isolates and chemically synthesized analogues or 
analogues biologically synthesized by heterologous sys- 
tems. 

As used herein the terms "pathognomonic concentration", 
"pathognomonic amount", and "pathognomonic staining 
pattern" refer to a concentration, amount, or localization 
pattern, respectively, of a FEN-1 protein or mRNA in a 
sample, that indicates the presence of a pathological (e.g., 
neoplastic, senescent, immunodeficient, neurodegenerative, 
inflammatory, etc.) condition or a predisposition to devel- 
oping a neoplastic disease, such as carcinoma, sarcoma, or 
leukemia. A pathognomonic amount is an amount of a 
FEN-1 protein or FEN-1 mRNA in a cell or cellular sample 
that falls outside the range of normal clinical values that is 
established by prospective and/or retrospective statistical 
clinical studies. Generally, an individual having a neoplastic 
disease (e.g., carcinoma, sarcoma, or leukemia) will exhibit 
an amount of FEN-1 protein or mRNA in a cell or tissue 
sample that is outside the range of concentrations that 
characterize normal, undiseased individuals; typically the 
pathognomonic concentration is at least about one standard 
deviation outside the mean normal value, more usually it is 
at least about two standard deviations or more above the 
mean normal value. However, essentially all clinical diag-
nostic tests produce some percentage of false positives and
false negatives. The sensitivity and selectivity of the diag-
nostic assay must be sufficient to satisfy the diagnostic
objective and any relevant regulatory requirements. In
general, the diagnostic methods of the invention are used to
identify individuals as disease candidates, providing an
additional parameter in a differential diagnosis of disease 
made by a competent health professional. 
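The standard-deviation criterion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the reference values, and the default threshold of two standard deviations are hypothetical, and an actual normal range would come from the statistical clinical studies described above.

```python
from statistics import mean, stdev


def deviation_from_normal(sample_value, normal_values):
    """Number of standard deviations by which sample_value differs from
    the mean of the normal (undiseased) reference values."""
    mu = mean(normal_values)
    sigma = stdev(normal_values)  # sample standard deviation of the reference set
    return (sample_value - mu) / sigma


def is_pathognomonic(sample_value, normal_values, threshold_sd=2.0):
    """True when the value lies at least threshold_sd standard deviations
    outside the normal mean, in either direction (threshold is an assumption)."""
    return abs(deviation_from_normal(sample_value, normal_values)) >= threshold_sd


# Hypothetical FEN-1 mRNA levels (arbitrary units) from undiseased individuals.
reference = [8, 9, 10, 11, 12]
flagged = is_pathognomonic(14, reference)
```

A flagged value only marks the individual as a disease candidate; it does not replace the differential diagnosis.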

As used herein the term "physiological conditions" refers
to temperature, pH, ionic strength, viscosity, and like
biochemical parameters which are compatible with a viable
organism, and/or which typically exist intracellularly in a
viable cultured yeast cell or mammalian cell. For example,
the intracellular conditions in a yeast cell grown under
typical laboratory culture conditions are physiological
conditions. Suitable in vitro reaction conditions for in vitro
transcription cocktails are generally physiological
conditions. In general, in vitro physiological conditions comprise
50-200 mM NaCl or KCl, pH 6.5-8.5, 20°-45° C. and
0.001-10 mM divalent cation (e.g., Mg++, Ca++); preferably
about 150 mM NaCl or KCl, pH 7.2-7.6, 5 mM divalent
cation, and often include 0.01-1.0 percent nonspecific
protein (e.g., BSA). A non-ionic detergent (Tween, NP-40,
Triton X-100) can often be present, usually at about 0.001 to
2%, typically 0.05-0.2% (v/v). Particular aqueous conditions
may be selected by the practitioner according to
conventional methods. For general guidance, the following
buffered aqueous conditions may be applicable: 10-250 mM
NaCl, 5-50 mM Tris HCl, pH 5-8, with optional addition of
divalent cation(s) and/or metal chelators and/or nonionic
detergents and/or membrane fractions and/or antifoam
agents and/or scintillants.
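As a rough illustration, the general in vitro ranges given above for salt, pH, and divalent cation can be encoded as a simple range check. The class and function names are hypothetical, and temperature is left out of this sketch for brevity.

```python
from dataclasses import dataclass


@dataclass
class BufferConditions:
    salt_mM: float      # NaCl or KCl concentration
    pH: float
    divalent_mM: float  # divalent cation concentration, e.g., Mg++ or Ca++


# General in vitro ranges taken from the definition above.
GENERAL_RANGES = {
    "salt_mM": (50, 200),
    "pH": (6.5, 8.5),
    "divalent_mM": (0.001, 10),
}


def out_of_range(conditions, ranges=GENERAL_RANGES):
    """Return the names of parameters that fall outside the given ranges."""
    return [
        name
        for name, (low, high) in ranges.items()
        if not low <= getattr(conditions, name) <= high
    ]


# The preferred conditions (about 150 mM salt, pH 7.2-7.6, 5 mM divalent cation) pass.
preferred = BufferConditions(salt_mM=150, pH=7.4, divalent_mM=5)
problems = out_of_range(preferred)
```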

As used herein, the terms "interacting polypeptide
segment" and "interacting polypeptide sequence" refer to a
portion of a hybrid protein which can form a specific binding
interaction with a portion of a second hybrid protein under
suitable binding conditions. Generally, a portion of the first
hybrid protein preferentially binds to a portion of the second
hybrid protein forming a heterodimer or higher order
heteromultimer comprising the first and second hybrid proteins;
the binding portions of each hybrid protein are termed
interacting polypeptide segments. Generally, interacting
polypeptides can form heterodimers with a dissociation
constant (K_D) of at least about 1×10³ M⁻¹, usually at least
1×10⁴ M⁻¹, typically at least 1×10⁵ M⁻¹, preferably at least
1×10⁶ M⁻¹ to 1×10⁷ M⁻¹ or more, under suitable
physiological conditions.

The term "recombinant" used herein refers to FEN-1
polypeptides produced by recombinant DNA techniques
wherein the gene coding for protein is cloned by known
recombinant DNA technology. For example, the human gene
for FEN-1 may be inserted into a suitable DNA vector, such
as a bacterial plasmid, and the plasmid used to transform a
suitable host. The gene is then expressed in the host to
produce the recombinant protein. The transformed host may
be prokaryotic or eukaryotic, including mammalian, yeast,
Aspergillus and insect cells. One preferred embodiment
employs bacterial cells as the host.
Overview 

Deoxyribonucleases play a role in many repair processes
since they enable the excision of damaged DNA and provide
a means for heteroduplex formation and processing in
recombination. Several deoxyribonucleases have been
shown both genetically and biochemically to function in
recombination and repair.

Nucleotide excision repair (NER) is a major pathway by
which damaged nucleotides are removed from DNA. The
biochemical steps leading to the repair of damaged DNA
bases include recognition of damage, incision and removal
of the damaged strand, DNA synthesis to replace the excised
nucleotides, and ligation. Current models involve specific
branched DNA structures at the site of damage. The
resulting branched DNA structures may then be cleaved by the
single-stranded endonucleases, such as RAD2 or RAD1/10
in Saccharomyces.

The design of branched DNA structures similar to those
hypothesized in NER has allowed the purification of a
mammalian DNA structure-specific endonuclease. This
enzyme, called Flap endonuclease-1 or FEN-1, cleaves DNA
flap strands that terminate with 5' single-stranded ends (see
structure in FIG. 6). DNA flap substrate 1 was designed to
detect structure-specific endonucleases in mammalian cells
(FIG. 6). This substrate is a 5' flap structure because the flap
strand terminates with a 5' single-stranded end. Conversely,
3' flap structures have a flap strand that terminates with a 3'
single-stranded end. Both 5' and 3' flap structures are
composed of a flap strand, an F_br (bridge) strand, and an F_adj
(adjacent) strand.

FEN-1 cleavage is flap strand specific and independent of
flap strand length. Other branched DNA structures,
including Holliday junctions, are not cleaved by FEN-1. In
addition to endonuclease activity, FEN-1 has a 5'-3' exonuclease
activity that is specific for double-stranded DNA. The
preferred cut sites of FEN-1 are located 1 nucleotide proximal
(hatched arrow) and 1 nucleotide distal (solid arrow) to the
elbow of the flap strand. FEN-1 specifically cleaves 5' flap
structures and nicked DNA but not 3' flap structures.
Cleavage of flap substrate 1 occurs primarily at 1 nucleotide
proximal and 1 nucleotide distal to the elbow of the flap
strand. Other 5' flap structures, however, are cleaved by
FEN-1 primarily at 1 nucleotide proximal to the flap strand
elbow. On the basis of these activities FEN-1 is believed to
be involved in replicative and repair DNA synthesis through
a nick translation mechanism.

Nick-translation, the concerted action of DNA
polymerization and degradation of an upstream primer, is an
important reaction in DNA replication and repair. In E. coli,
nick-translation is carried out by DNA polymerase I (Pol I).
This complex enzyme is made up of a polymerase domain,
a 3'-5' exonuclease proof-reading domain, and a 5'-3'
exonuclease domain. Mutation of the 5'-3' exonuclease domain
results in the inability of Pol I to carry out nick-translation.
In vivo, this is manifested as a loss of cell viability due to
the inability to remove the RNA primers on Okazaki
fragments. Recently, the 5'-3' exonuclease domain of Pol I has
been shown to have a structure-specific endonuclease
domain that cleaves a branched DNA structure called a DNA
flap (Lyamichev et al. 1993). This endonuclease activity
may be important for the removal of some types of DNA
damage.

In eukaryotic cells, DNA polymerases do not have an
intrinsic 5'-3' exonuclease domain. Nick-translation activity
may be achieved by a 5'-3' exonuclease that interacts with a
DNA polymerase through protein-protein interactions.

Nonhomologous recombination is a broad term used to
describe a variety of DNA recombination reactions in which
little or no sequence homology is used. Examples include
chromosomal translocations, movements of retroviruses and
transposable elements, developmental rearrangements of
antibody and T-cell receptor genes, and gene amplification.
The products from these recombination events are the result
of DNA breakage followed by DNA end-joining. Random
DNA breakage can occur as a result of a variety of metabolic
processes or by exogenous factors such as X-irradiation.
Following breakage, DNA ends are joined very efficiently by
mammalian cells. Virtually any two ends, regardless of
sequence or configuration, can be joined in the cell.
Mammalian cells have mechanisms which modify DNA ends,
allowing them to be joined.

DNA sequence analysis of end-joining products indicates
that DNA ends can be modified and joined in a variety of
ways. One important finding from these studies is that a
major fraction of end-joining events utilize short terminal
homologies of 1-5 nucleotides (nt) in the resolution of DNA
ends.

Homology at DNA termini has also been implicated in the
joining of coding ends during V(D)J recombination. V(D)J
recombination is a site-specific recombination system in
mammalian cells which directs the rearrangement of the
antigen receptors of the immune system. This reaction is
initiated by a site-specific recombinase which introduces
double-stranded breaks at the recombination signal
sequences. As in general DNA end-joining, the presence of
homology is not required for joining to occur.

In the template-directed ligation and post-repair ligation
models of DNA end-joining, base-pairing of several
terminal nucleotides can result in the formation of DNA flap
structures. This structure may form as a result of limited
exonuclease action at the DNA ends exposing
single-stranded regions. Transient base-pairing of the exposed
single strands can then occur between the two ends. The
resulting branched DNA structure, therefore, consists of
both duplex DNA and unpaired displaced single strands. In
order to resolve this structure, the displaced single strands
must be removed before ligation can occur. It has not been
clear, however, whether the displaced strands are removed
endo- or exonucleolytically.

The present invention provides FEN-1, a FLAP
endonuclease involved in DNA repair, recombination, and
replication involving DNA flap intermediates or DNA:RNA flap
intermediates.
General Methods 

The nomenclature used hereafter and the laboratory
procedures in cell culture, molecular genetics, and nucleic acid
chemistry and hybridization described below may involve
well known and commonly employed procedures in the art.
Standard techniques are used for recombinant nucleic acid
methods, polynucleotide synthesis, and microbial culture
and transformation (e.g., electroporation, lipofection). The
techniques and procedures are generally performed
according to conventional methods in the art and various general
references (see, generally, Sambrook et al. Molecular
Cloning: A Laboratory Manual, 2d ed. (1989) Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, N.Y., which
is incorporated herein by reference) which are provided
throughout this document.

Oligonucleotides can be synthesized on an Applied
Biosystems oligonucleotide synthesizer according to
specifications provided by the manufacturer.

Methods for PCR amplification are described in the art
(PCR Technology: Principles and Applications for DNA
Amplification ed. H. A. Erlich, Freeman Press, New York,
N.Y. (1992); PCR Protocols: A Guide to Methods and
Applications, eds. Innis, Gelfand, Sninsky, and White,
Academic Press, San Diego, Calif. (1990); Mattila et al. (1991)
Nucleic Acids Res. 19: 4967; Eckert, K. A. and Kunkel, T. A.
(1991) PCR Methods and Applications 1: 17; PCR, eds.
McPherson, Quirke, and Taylor, IRL Press, Oxford; and
U.S. Pat. No. 4,683,202, which are incorporated herein by
reference).

Also incorporated herein by reference are: Harrington and
Lieber (1994) Genes and Development 8: 1344; Harrington
and Lieber (1994) The EMBO J. 13: 1235; Hiraoka et al.
(1995) Genomics 25: 220; and Harrington and Lieber (1995)
J. Biol. Chem. 270.

FEN-1 Polypeptides and Polynucleotides 

Cloning of FEN-1 Polynucleotides
Disclosure of the full coding sequences for mammalian
FEN-1 shown in FIG. 1 and FIG. 2 makes possible the
construction of isolated polynucleotides that can direct the
expression of FEN-1, fragments thereof, or analogs thereof.
Further, the sequences in FIG. 1, FIG. 2, and FIG. 5 make
possible the construction of nucleic acid hybridization
probes and PCR primers that can be used to detect RNA and
DNA sequences encoding FEN-1.

Polynucleotides encoding full-length FEN-1 or fragments
or analogs thereof, may include sequences that facilitate
transcription (expression sequences) and translation of the
coding sequences, such that the encoded polypeptide
product is produced. Construction of such polynucleotides is
well known in the art and is described further in Maniatis et
al., Molecular Cloning: A Laboratory Manual, 2nd Ed.
(1989), Cold Spring Harbor, N.Y. For example, but not for
limitation, such polynucleotides can include a promoter, a
transcription termination site (polyadenylation site in
eukaryotic expression hosts), a ribosome binding site, and,
optionally, an enhancer for use in eukaryotic expression
hosts, and, optionally, sequences necessary for replication of
a vector. A typical eukaryotic expression cassette will
include a polynucleotide sequence encoding a FEN-1
polypeptide linked downstream (i.e., in translational reading
frame orientation; polynucleotide linkage) of a promoter
such as the HSV tk promoter or the pgk (phosphoglycerate
kinase) promoter, optionally linked to an enhancer and a
downstream polyadenylation site (e.g., an SV40 large T Ag
poly A addition site).

Preferably, these amino acid sequences occur in the given
order (in the amino-terminal to carboxy-terminal
orientation) and may comprise other intervening and/or
terminal sequences; generally such polypeptides are less
than 1000 amino acids in length, more usually less than
about 500 amino acids in length, and frequently
approximately 204 amino acids in length. The degeneracy of the
genetic code gives a finite set of polynucleotide sequences
encoding these amino acid sequences; this set of degenerate
sequences may be readily generated by hand or by computer
using commercially available software (Wisconsin Genetics
Software Package Release 7.0). Isolated FEN-1
polynucleotides typically are less than approximately 10,000
nucleotides in length.
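To illustrate how a set of degenerate sequences can be generated by computer, the following Python sketch enumerates every polynucleotide that encodes a short peptide under the standard genetic code. It is a minimal illustration and not the Wisconsin package named above; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# for each of the three codon positions ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each codon to its one-letter amino acid.
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Invert the table: amino acid -> list of codons that encode it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def count_encodings(peptide):
    """Size of the degenerate set: product of codon counts per residue."""
    total = 1
    for aa in peptide:
        total *= len(CODONS_FOR[aa])
    return total


def degenerate_sequences(peptide):
    """Yield every DNA sequence that encodes the peptide."""
    for codons in product(*(CODONS_FOR[aa] for aa in peptide)):
        yield "".join(codons)


# Example: the first four residues (KEPE) of one of the FEN-1 epitopes.
kepe_sequences = list(degenerate_sequences("KEPE"))
```

The set grows multiplicatively with peptide length, so for long polypeptides it is usually more practical to count the set with `count_encodings` than to enumerate it.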

Additionally, where expression of a polypeptide is not
desired, polynucleotides of this invention need not encode a
functional protein. Polynucleotides of this invention may
serve as hybridization probes and/or PCR primers
(amplimers) and/or LCR oligomers for detecting FEN-1
RNA or DNA sequences.

Alternatively, polynucleotides of this invention may serve
as hybridization probes or primers for detecting RNA or
DNA sequences of related genes; such genes may encode
structurally or evolutionarily related proteins. For such
hybridization and PCR applications, the polynucleotides of
the invention need not encode a functional polypeptide.
Thus, polynucleotides of the invention may contain
substantial deletions, additions, nucleotide substitutions and/or
transpositions, so long as specific hybridization or specific
amplification to a FEN-1 sequence is retained.

Genomic or cDNA clones encoding FEN-1 may be
isolated from clone libraries (e.g., available from Clontech,
Palo Alto, Calif.) using hybridization probes designed on the
basis of the nucleotide sequences shown in FIG. 1 and FIG.
2 and FIG. 5 and using conventional hybridization screening
methods (e.g., Benton W D and Davis R W (1977) Science
196: 180; Goodspeed et al. (1989) Gene 76: 1). Where a
cDNA clone is desired, clone libraries containing cDNA
derived from somatic cell mRNA or other FEN-1-expressing
cell mRNA are preferred. Alternatively, synthetic
polynucleotide sequences corresponding to all or part of the sequences
shown in FIG. 1, FIG. 2 and FIG. 5 may be constructed by
chemical synthesis of oligonucleotides. Additionally,
polymerase chain reaction (PCR) using primers based on the
sequence data disclosed in FIG. 1 and FIG. 2 may be used
to amplify DNA fragments from genomic DNA, mRNA
pools, or from cDNA clone libraries. U.S. Pat. Nos.
4,683,195 and 4,683,202 describe the PCR method. Additionally,
PCR methods employing one primer that is based on the
sequence data disclosed in FIG. 1 or 2 and a second primer
that is not based on that sequence data may be used. For
example, a second primer that is homologous to or
complementary to a polyadenylation segment may be used.

Provided in the invention are polynucleotides comprising
a segment encoding a FEN-1 epitope or a multiplicity of
FEN-1 epitopes. Preferred human FEN-1 epitopes are:
-IQGLAKLIADVAPSAIRENDIK- (SEQ ID NO: 16);
-SMSIYQFLIAVRQGGD- (SEQ ID NO: 17);
-TSHLMGMFYRTIRMMENGIKPV- (SEQ ID NO: 18);
-GKPPQLKSGELAKRSERRAEAEKQ- (SEQ ID NO: 19);
-EQEVEKFTKRLVKVTKQHND- (SEQ ID NO: 20);
-LLSLMGIPYLDAPSEAEASCAALVK- (SEQ ID NO: 21);
-LTFGSPVLMRHLTASEAKKLPIQ- (SEQ ID NO: 22);
-ILQELGLNQEQFVDLCILLGS- (SEQ ID NO: 23);
-RGIGPKRAVDLIQKHKSIEEIVRR- (SEQ ID NO: 24);
-PENWLHKEAHQLFLEPEVLD- (SEQ ID NO: 25);
-WSEPNEEELIKFMCGEKQFSEE- (SEQ ID NO: 26);
-SKSRQGSTQGRLDDFFKVTGSL- (SEQ ID NO: 27);
and -KEPEPKGSTKKKAKTG- (SEQ ID NO: 28).
Polynucleotides encoding epitopes having substantial
identity to these preferred epitopes are often employed. Such
polynucleotides have a variety of uses, including as FEN-1
probes, as templates for producing polypeptides comprising
a FEN-1 epitope whereby such proteins are FEN-1
immunogens or commercial diagnostic reagents for standardizing
a FEN-1 immunoassay, as polynucleotide vaccines
(immunogens) when fused to a secretory sequence for
administering to an animal and making α-FEN-1 antisera
and hybridomas; such polynucleotides can also be used as
foodstuffs, combustible energy sources, and
viscosity-enhancing solutes.

Isolation of the Cognate FEN-1 Genes

Mammalian homologs of the human and murine FEN-1
gene or cDNA are identified and isolated by screening a
mammalian genomic or cDNA clone library, such as a human, rat,
rabbit, or other genomic or cDNA library in yeast artificial
chromosomes, cosmids, or bacteriophage λ (e.g., λ Charon
35), with a polynucleotide probe comprising a sequence of
at least about 24 contiguous nucleotides (or their
complement) of the cDNA sequence shown in FIG. 1 (B) or
FIG. 2 (B). Typically, hybridization and washing
are performed at high stringency according to conventional
hybridization procedures. Positive clones are isolated and
sequenced. For illustration and not for limitation, a
full-length polynucleotide corresponding to the sequence of FIG.
1 (B) or FIG. 2 (B) may be labeled and used as a
hybridization probe to isolate genomic clones from a human or
murine genomic clone library in λEMBL4 or λGEM11
(Promega Corporation, Madison, Wis.); typical
hybridization conditions for screening plaque lifts (Benton and Davis
(1977) Science 196: 180) can be: 50% formamide, 5×SSC or
SSPE, 1-5×Denhardt's solution, 0.1-1% SDS, 100-200 μg
sheared heterologous DNA or tRNA, 0-10% dextran sulfate,
1×10⁵ to 1×10⁷ cpm/ml of denatured probe with a specific
activity of about 1×10⁸ cpm/μg, and incubation at 42°
C.-37° C. for about 6-36 hours. Prehybridization conditions
are essentially identical except that probe is not included and
incubation time is typically reduced. Washing conditions are
typically 1-3×SSC, 0.1-1% SDS, 50°-70° C. with change of
wash solution at about 5-30 minutes. For isolating human
FEN-1 polynucleotides with a mouse or human FEN-1
polynucleotide probe, it is often preferred to hybridize at
approximately 39° C. and to wash sequentially at the
following step temperatures: room temperature, 37° C., 39° C.,
42° C., 45° C., 50° C., 55° C., 60° C., 65° C., and 70° C.,
stopping after each step and monitoring the background
probe signal (and optionally detecting signal by
autoradiogram and/or phosphor imaging, if radiolabeled probe is
used) and terminating the washing steps when suitable
signal/noise ratio is achieved, as determined empirically.
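The stepwise washing procedure above amounts to a simple loop with an empirical stopping rule. The Python sketch below shows only that control flow. The measurement callback, the target ratio, and the use of 22° C. to stand for room temperature are assumptions made for illustration.

```python
# Wash temperatures in degrees C; 22 stands in for "room temperature"
# (an assumption, since the text gives no numeric value).
WASH_STEPS_C = [22, 37, 39, 42, 45, 50, 55, 60, 65, 70]


def stepwise_wash(measure_signal_to_noise, target_ratio, steps=WASH_STEPS_C):
    """Wash at successively higher temperatures, checking the filter
    after each step and stopping once the signal/noise ratio is suitable.

    measure_signal_to_noise is a hypothetical callable that takes the
    wash temperature just used and returns the observed ratio.
    Returns (last temperature used, last observed ratio).
    """
    temp, ratio = None, None
    for temp in steps:
        ratio = measure_signal_to_noise(temp)
        if ratio >= target_ratio:
            break  # suitable signal/noise reached; stop washing
    return temp, ratio


# Toy stand-in for a real measurement: the ratio improves with temperature.
final_temp, final_ratio = stepwise_wash(lambda t: t / 10, target_ratio=5.0)
```

In practice the suitable ratio is determined empirically, as the text notes, so the target here is a placeholder rather than a recommended value.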

Human and other mammalian FEN-1 cDNAs and
genomic clones (i.e., cognate nonhuman genes) can be
analogously isolated from various nonhuman cDNA and
genomic clone libraries available in the art (e.g., Clontech,
Palo Alto, Calif.) by using probes based on the sequences
shown in FIG. 1 (B) or FIG. 2 (B), with hybridization and
washing conditions typically being less stringent than for
isolation of FEN-1 human or mouse clones.

Polynucleotides comprising sequences of approximately
at least 30-50 nucleotides, preferably at least 100
nucleotides, corresponding to or complementary to the
nucleotide sequences shown in FIG. 1 (B) or FIG. 2 (B) can
serve as PCR primers and/or hybridization probes for
identifying and isolating germline genes corresponding to
FEN-1. These germline genes may be human or may be from a
related mammalian species, preferably rodents or primates.
Such germline genes may be isolated by various methods
conventional in the art, including, but not limited to, by
hybridization screening of genomic libraries in
bacteriophage λ or cosmid libraries, or by PCR amplification of
genomic sequences using primers derived from the
sequences shown in FIG. 1 (B) or FIG. 2 (B). Human
genomic libraries are publicly available or may be
constructed de novo from human DNA.

It is apparent to one of skill in the art that nucleotide
substitutions, deletions, and additions may be incorporated
into the polynucleotides of the invention. Nucleotide
sequence variation may result from sequence
polymorphisms of various FEN-1 alleles, minor sequencing errors,
and the like. However, such nucleotide substitutions,
deletions, and additions should not substantially disrupt the
ability of the polynucleotide to hybridize to one of the
polynucleotide sequences shown in FIG. 1 (B) or FIG. 2 (B)
under hybridization conditions that are sufficiently stringent
to result in specific hybridization.

Specific hybridization is defined herein as the formation
of hybrids between a probe polynucleotide (e.g., a
polynucleotide of the invention which may include substitutions,
deletions, and/or additions) and a specific target
polynucleotide (e.g., a polynucleotide having the sequence in FIG. 1
(B) or FIG. 2 (B)), wherein the probe preferentially hybridizes
to the specific target such that, for example, a single band
corresponding to one or more of the RNA species of FEN-1
(or alternatively spliced mRNA species) can be identified on
a Northern blot of RNA prepared from a suitable cell source
(e.g., a somatic cell expressing FEN-1). Polynucleotides of
the invention and recombinantly produced FEN-1, and
fragments or analogs thereof, may be prepared on the basis of the
sequence data provided in FIG. 1 (B), FIG. 2 (B), and
FIG. 5 according to methods known in the art and described
in Maniatis et al., Molecular Cloning: A Laboratory
Manual, 2nd Ed., (1989), Cold Spring Harbor, N.Y. and
Berger and Kimmel, Methods in Enzymology, Volume 152,
Guide to Molecular Cloning Techniques (1987), Academic
Press, Inc., San Diego, Calif., which are incorporated herein
by reference.

FEN-1 polynucleotides may be short oligonucleotides
(e.g., 20-100 bases long), such as for use as hybridization
probes and PCR (or LCR) primers. FEN-1 polynucleotide
sequences may also comprise part of a larger polynucleotide
(e.g., a cloning vector comprising a FEN-1 clone) and may
be fused, by polynucleotide linkage, in frame with another
polynucleotide sequence encoding a different protein (e.g.,
glutathione S-transferase or β-galactosidase) for encoding
expression of a fusion protein. Typically, FEN-1
polynucleotides comprise at least 25 consecutive nucleotides which
are substantially identical to a naturally-occurring FEN-1
sequence (e.g., FIGS. 1 or 2), more usually FEN-1
polynucleotides comprise at least 50 to 100 consecutive
nucleotides which are substantially identical to a
naturally-occurring FEN-1 sequence. However, it will be recognized
by those of skill that the minimum length of a FEN-1
polynucleotide required for specific hybridization to a
FEN-1 target sequence will depend on several factors: G/C
content, positioning of mismatched bases (if any), degree of
uniqueness of the sequence as compared to the population of
target polynucleotides, and chemical nature of the
polynucleotide (e.g., methylphosphonate backbone, polyamide
nucleic acid, phosphorothioate, etc.), among others.

If desired, PCR amplimers for amplifying substantially
full-length cDNA copies may be selected at the discretion of
the practitioner. Similarly, amplimers to amplify single
FEN-1 exons or portions of the FEN-1 gene (murine or
human) may be selected.

Each of these sequences may be used as hybridization
probes or PCR amplimers to detect the presence of FEN-1
mRNA, for example to diagnose a neoplastic disease
characterized by the presence of an elevated or reduced FEN-1
mRNA level in cells, or to perform tissue typing (i.e.,
identify tissues characterized by the expression of FEN-1
mRNA), and the like. The sequences may also be used for
detecting genomic FEN-1 gene sequences in a DNA sample,
such as for forensic DNA analysis (e.g., by RFLP analysis,
PCR product length(s) distribution, etc.) or for diagnosis of
diseases characterized by amplification and/or
rearrangements of the FEN-1 gene. Alternatively, FEN-1
polynucleotides can be used as a foodstuff, combustible energy source,
viscosity-enhancing solute, and the like. In a variation of the
invention, polynucleotides of the invention are employed for
diagnosis of pathological conditions or genetic disease that
involve neoplasia or other medical conditions related to
FEN-1 function, and more specifically conditions and
diseases that involve alterations in the structure or abundance
of a FEN-1 polypeptide.

For example and not limitation, the following pair of
oligonucleotide primers can be used to amplify FEN-1
polynucleotide sequences (e.g., cDNA) or as hybridization
probes (e.g., as biotinylated or end-labeled oligonucleotide
probes):

5'-ATGGGAATTCAAGGCCTGGCCAAACT-3' (SEQ ID
NO: 14) and

5'-TTTATTTTCCCCTTTTAAACTTCCCTGC-3' (SEQ
ID NO: 15). Other suitable PCR primers, LCR primers,
hybridization probes, exon-specific hybridization probes
and primers, degenerate oligonucleotides encoding FEN-1
polypeptide sequences, and the like are apparent to those of
skill in the art in view of FIG. 1, FIG. 2 and FIG. 5, and other
FEN-1 sequences which can be obtained therewith.
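As a quick consistency check on a primer pair such as the one above, the reverse primer can be reverse-complemented to recover the coding-strand sequence it anneals opposite, and the G/C fraction of each primer can be computed. The Python sketch below is illustrative only; the helper names are hypothetical and the primer strings are SEQ ID NO: 14 and SEQ ID NO: 15 as read above.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


FORWARD_PRIMER = "ATGGGAATTCAAGGCCTGGCCAAACT"    # SEQ ID NO: 14
REVERSE_PRIMER = "TTTATTTTCCCCTTTTAAACTTCCCTGC"  # SEQ ID NO: 15

# Coding-strand sequence matched by the reverse primer.
reverse_site = reverse_complement(REVERSE_PRIMER)
forward_gc = gc_fraction(FORWARD_PRIMER)
```

The reverse-complemented site reads GCAGGGAAGTTTAAAAGGGGAAAATAAA, which corresponds to the 3' end of the human coding sequence, so the pair brackets the open reading frame.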

For example and not limitation, a FEN-1 polynucleotide
can comprise the sequence:

5'-ATGGGAATTCAAGGCCTGGCCAAACTAATTGCT
GATGTGGCCCCCAGTGCCATCCGGGAGAATGA
CATCAAGAGCTACTTTGGCCGTAAGGTG
GCCATTGATGCCTCTATGAGCATTTATCA
GTTCCTGATTGCTGTTCGCCAGGGTGGGGATGT
GCTGCAGAATGAGGAGGGTGAGACCACCAGC
CACCTGATGGGCATGTTCTACCGCAC
CATTCGCATGATGGAGAACGGCATCAAGCCCG
TGTATGTCTTTGATGGCAAGCCGCCA
CAGCTCAAGTCAGGCGAGCTGGC
CAAACGCAGTGAGCGGCGGGCTGAGGCAG
AGAAGCAGCTGCAGCAGGCTCAGGCT
GCTGGGGCCGAGCAGGAGGTGGAAAAATTCAC
TAAGCGGCTGGTGAAGGTCACTAAGCAG
CACAATGATGAGTGCAAAC
ATCTGCTGAGCCTCATGGGCATCCCT
TATCTTGATGCACCCAGTGAGGCAGAG
GCCAGCTGTGCTGCCCTGGTGAAGGCTGGCA
AAGTCTATGCTGCGGCTACCGAGGA
CATGGACTGCCTCACCTTCGGCAGCCCTGT
GCTAATGCGACACCTGACTGCCAGT
GAAGCCAAAAAGCTGCCAA
TCCAGGAATTCCACCTGAGCCGGATTCT
GCAGGAGCTGGGCCTGAACCAGGAACAGTTTGT
GGATCTGTGCATCCTGCTAGGCAGTGAC
TACTGTGAGAGTATCCGGGGTATTGGGCCCAAG
CGGGCTGTGGACCTCATCCAGAAGCA
CAAGAGCATCGAGGAGATCGTGCGGC
GACTTGACCCCAACAAGTACCCTGTGCC
AGAAAATTGGCTCCACAAGGAGGCTCAC
CAGCTCTTCTTGGAACCTGAGGTGCTGGACCCA
GAGTCTGTGGAGCTGAAGTGGAGCGAGC
CAAATGAAGAAGAG
CTGATCAAGTTCATGTGTGGTGAAAAG
CAGTTCTCTGAGGAGCGAATCCG
CAGTGGGGTCAAGAGGCTGAGTAAGAGC
CGCCAAGGCAGCACCCAGGGCCGCCTG
GATGATTTCTTCAAGGTGACCGGCTCACTCTCT
TCAGCTAAGCGCAAGGAGCCAGAAC
CCAAGGGATCCACTAAGAAG
AAGGCAAAGACTGGG
GCAGCAGGGAAGTTTAAAAGGGGAAAATAAA-3'
(SEQ ID NO: 29)
Also for example and not limitation, a FEN-1
polynucleotide can comprise one or more sequences selected from the
group consisting of:

5'-TGGGAATTCAAGGCCTGGCCAAACTAATTGCTGATGTGGCCCCCA-3' (SEQ ID NO: 30);
5'-TGACATCAAGAGCTACTTTGGCCGTAAGGTGGCCA-3' (SEQ ID NO: 31);
5'-TGCCTCTATGAGCATTTATCAGTTCCTGATTGCTGTT-3' (SEQ ID NO: 32);
5'-GGATGTGCTGCAGAATGAGGAGGGTGAGACCAC-3' (SEQ ID NO: 33);
5'-TGGGCATGTTCTACCGCACCATTCGCATGATGGAGAACG-3' (SEQ ID NO: 34);
5'-CTTTGATGGCAAGCCGCCACAGCTCAAGTCAGGCGAGCTGG-3' (SEQ ID NO: 35);
5'-AGCAGCTGCAGCAGGCTCAGGCTGCTGGGGCC-3' (SEQ ID NO: 36);
5'-AATTCACTAAGCGGCTGGTGAAGGTCACTAAGCAG-3' (SEQ ID NO: 37);
5'-ATGATGAGTGCAAACATCTGCTGAGCCTCATG-3' (SEQ ID NO: 38);
5'-ATCCCTTATCTTGATGCACCCAGTGAGGCAGAGGCCA-3' (SEQ ID NO: 39);
5'-GCCCTGGTGAAGGCTGGCAAAGTCTATGCTGCGGCTACCGAGGA-3' (SEQ ID NO: 40);
5'-CTTCGGCAGCCCTGTGCTAATGCGACACCTGAC-3' (SEQ ID NO: 41);
5'-CAGGAATTCCACCTGAGCCGGATTCTGCAGGAGCTG-3' (SEQ ID NO: 42);
5'-CCTGAACCAGGAACAGTTTGTGGATCTGTGCATCCT-3' (SEQ ID NO: 43);
5'-AGGCAGTGACTACTGTGAGAGTATCCGGGGTATTGGGCCCA-3' (SEQ ID NO: 44);
5'-GGCTGTGGACCTCATCCAGAAGCACAAGAGCATCGAGGA-3' (SEQ ID NO: 45);
5'-CAAGTACCCTGTGCCAGAAAATTGGCTCCACAAGGAGGCT-3' (SEQ ID NO: 46);
5'-CTGAGGTGCTGGACCCAGAGTCTGTGGAGCTGAAGTGG-3' (SEQ ID NO: 47);
5'-GATCAAGTTCATGTGTGGTGAAAAGCAGTTCTCTGAGGAGC-3' (SEQ ID NO: 48);
5'-ATCCGCAGTGGGGTCAAGAGGCTGAGTAAGAGCCGCCA-3' (SEQ ID NO: 49);
5'-GCAGCACCCAGGGCCGCCTGGATGATTTCTTC-3' (SEQ ID NO: 50);
5'-CGGCTCACTCTCTTCAGCTAAGCGCAAGGAGCCA-3' (SEQ ID NO: 51);
5'-CCCAAGGGATCCACTAAGAAGAAGGCAAAGACTGGGGCAGC-3' (SEQ ID NO: 52).

A preferred FEN-1 polynucleotide comprises all of these
sequences in the given order, with or without spacer
polynucleotides between the given sequences. Non-coding
sequences of a FEN-1 polynucleotide, such as provided in
FIG. 1 or FIG. 2, or equivalent non-mouse, non-human
FEN-1 polynucleotide, can also be used.

For example and not limitation, a FEN-1 polynucleotide
can comprise the nucleotide sequence shown in FIG. 1 (SEQ
ID NO: 2) or FIG. 2 (SEQ ID NO: 4).

For example, a FEN-1 cDNA and/or genomic clone can
be identified and isolated from a cDNA or genomic library,
respectively, by hybridization of a labeled probe comprising
the polynucleotide sequence of FIG. 1 (B) and/or FIG. 2 (B)
or a pool of degenerate oligonucleotides encoding a segment
of the polynucleotide sequence shown in FIG. 1 (A) or FIG.
2 (A). Suitable hybridization conditions for specific
hybridization of these labeled probes to the mammalian FEN-1
cDNA or gene can be established empirically by performing
a series of hybridizations and/or washing steps at several
temperatures and/or ionic strength conditions; for example
and not limitation, hybridization conditions comprising 50%
formamide, 5xSSC or SSPE, 1-5x Denhardt's solution,
0.1-1% SDS, 100-200 µg sheared heterologous DNA or
tRNA, 0-10% dextran sulfate, 1x10^5 to 1x10^7 cpm/ml of
denatured probe with a specific activity of about 1x10^8
cpm/µg, and incubation at 42° C.-37° C. for about 6-36
hours is often a suitable initial point.
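
As a worked illustration of the probe figures given above, the
following Python sketch converts a target probe concentration
(cpm/ml) and a hybridization volume into the mass of labeled probe
required, using the stated specific activity of about 1x10^8 cpm/µg.
The function name probe_mass_ng and the 10 ml volume are assumptions
made only for this example.

```python
def probe_mass_ng(target_cpm_per_ml: float,
                  volume_ml: float,
                  specific_activity_cpm_per_ug: float = 1e8) -> float:
    """Mass of labeled probe (in ng) needed to reach the target
    concentration in the given hybridization volume."""
    if specific_activity_cpm_per_ug <= 0:
        raise ValueError("specific activity must be positive")
    total_cpm = target_cpm_per_ml * volume_ml
    mass_ug = total_cpm / specific_activity_cpm_per_ug
    return mass_ug * 1000.0  # convert µg to ng


# Hypothetical example: 10 ml of hybridization solution at 1x10^6 cpm/ml.
example_ng = probe_mass_ng(1e6, 10.0)
```

Under these assumptions the example corresponds to about 100 ng of
probe; the two ends of the stated range (1x10^5 and 1x10^7 cpm/ml)
scale this figure by a factor of ten in either direction.
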

Antisense Polynucleotides

Additional embodiments directed to modulation of
neoplasia or cell death include methods that employ specific
antisense polynucleotides complementary to all or part of
the sequences shown in FIG. 1 (B) or FIG. 2 (B) or a cognate
mammalian FEN-1 sequence. Such complementary antisense
polynucleotides may include nucleotide substitutions,
additions, deletions, or transpositions, so long as specific
hybridization to the relevant target sequence corresponding
to FIG. 1 (B) or FIG. 2 (B) is retained as a functional
property of the polynucleotide. Complementary antisense
polynucleotides include soluble antisense RNA or DNA
oligonucleotides which can hybridize specifically to FEN-1
mRNA species and prevent transcription of the mRNA species
and/or translation of the encoded polypeptide (Ching et al.
(1989) Proc. Natl. Acad. Sci. U.S.A. 86: 10006; Broder et al.
(1990) Ann. Int. Med. 113: 604; Loreau et al. (1990) FEBS
Letters 274: 53; Holcenberg et al., WO91/11535; U.S. Ser.
No. 07/530,165; WO91/09865; WO91/04753; WO90/13641;
and EP 386563, each of which is incorporated herein
by reference). The antisense polynucleotides therefore
inhibit production of FEN-1 polypeptides. Antisense
polynucleotides that prevent transcription and/or translation
of mRNA corresponding to FEN-1 polypeptides may inhibit
neoplasia, senescence, AIDS, and the like, and/or reverse the
transformed phenotype of cells. Antisense polynucleotides
of various lengths may be produced, although such antisense
polynucleotides typically comprise a sequence of at least
about 25 consecutive nucleotides which are substantially
identical to a naturally-occurring FEN-1 polynucleotide
sequence, and typically which are identical to a sequence
shown in FIG. 1 (B), FIG. 2 (B), or FIG. 5 or a FEN-1
sequence disclosed herein.
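
A minimal Python sketch of how an antisense DNA oligonucleotide
complementary to a chosen target segment might be derived is given
below. The function name antisense_dna is hypothetical, the 25
nucleotide minimum simply mirrors the typical length stated above,
and the target used is the 32-nucleotide segment of SEQ ID NO: 50.
It illustrates plain reverse-complementation only and says nothing
about which target regions are effective.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_dna(target: str, min_length: int = 25) -> str:
    """Return the reverse complement (written 5' to 3') of a DNA
    target segment, enforcing a minimum length."""
    target = target.upper()
    if len(target) < min_length:
        raise ValueError(f"target shorter than {min_length} nucleotides")
    if set(target) - set("ACGT"):
        raise ValueError("target contains characters other than A, C, G, T")
    return target.translate(_COMPLEMENT)[::-1]


# Target segment: SEQ ID NO: 50 as listed earlier in this document.
TARGET = "GCAGCACCCAGGGCCGCCTGGATGATTTCTTC"
oligo = antisense_dna(TARGET)
```

Applying the function twice returns the original target, which is a
convenient sanity check on any reverse-complement routine.
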

Antisense polynucleotides may be produced from a
heterologous expression cassette in a transfectant cell or
transgenic cell, such as a transgenic pluripotent hematopoietic
stem cell used to reconstitute all or part of the hematopoietic
stem cell population of an individual. Alternatively, the
antisense polynucleotides may comprise soluble
oligonucleotides that are administered to the external milieu,
either in the culture medium in vitro or in the circulatory
system or interstitial fluid in vivo. Soluble antisense
polynucleotides present in the external milieu have been shown
to gain access to the cytoplasm and inhibit translation of
specific mRNA species. In some embodiments the antisense
polynucleotides comprise methylphosphonate moieties. For
general methods relating to antisense polynucleotides, see
Antisense RNA and DNA, (1988), D. A. Melton, Ed., Cold Spring
Harbor Laboratory, Cold Spring Harbor, N.Y.

Transgenic Animal Embodiments 
Genomic clones of FEN-1, particularly of the murine
FEN-1 gene, may be used to construct homologous targeting
constructs for generating cells and transgenic nonhuman
animals having at least one functionally disrupted FEN-1
allele. Guidance for construction of homologous targeting
constructs may be found in the art, including: Rahemtulla et
al. (1991) Nature 353: 180; Jasin et al. (1990) Genes Devel.
4: 157; Koh et al. (1992) Science 256: 1210; Molina et al.
(1992) Nature 357: 161; Grusby et al. (1991) Science 253:
1417; and Bradley et al. (1992) Bio/Technology 10: 534, each
of which is incorporated herein by reference. Homologous
targeting can be used to generate so-called "knockout" mice,
which are heterozygous or homozygous for an inactivated FEN-1
allele. Such mice may be sold commercially as research
animals for investigation of immune system development,
neoplasia, or spermatogenesis, may be used as pets, may be
used for animal protein (foodstuff), and may have other uses.

Chimeric targeted mice are derived according to Hogan,
et al., Manipulating the Mouse Embryo: A Laboratory
Manual, Cold Spring Harbor Laboratory (1988) and
Teratocarcinomas and Embryonic Stem Cells: A Practical
Approach, E. J. Robertson, ed., IRL Press, Washington,
D.C., (1987) which are incorporated herein by reference.

Embryonic stem cells are manipulated according to
published procedures (Teratocarcinomas and Embryonic Stem
Cells: A Practical Approach, E. J. Robertson, ed., IRL Press,
Washington, D.C. (1987); Zijlstra et al. (1989) Nature
342:435; and Schwartzberg et al. (1989) Science 246: 799,
each of which is incorporated herein by reference).

Additionally, a FEN-1 cDNA or genomic gene copy may
be used to construct transgenes for expressing FEN-1
polypeptides at high levels and/or under the transcriptional
control of transcription control sequences which do not
naturally occur adjacent to the FEN-1 gene. For example but
not limitation, a constitutive promoter (e.g., a HSV-tk or pgk
promoter) or a cell-lineage specific transcriptional
regulatory sequence (e.g., a CD4 or CD8 gene promoter/enhancer)
may be operably linked to a FEN-1-encoding polynucleotide
sequence to form a transgene (typically in combination with
a selectable marker such as a neo gene expression cassette).
Such transgenes can be introduced into cells (e.g., ES cells,
hematopoietic stem cells) and transgenic cells and
transgenic nonhuman animals may be obtained according to
conventional methods. Transgenic cells and/or transgenic
nonhuman animals may be used to screen for antineoplastic
agents and/or to screen for potential carcinogens, as
overexpression of FEN-1 or inappropriate expression of FEN-1
may result in a preneoplastic or neoplastic state.


Production of FEN-1 Polypeptides 

The nucleotide and amino acid sequences shown in FIG.
1, FIG. 2, and FIG. 5 enable those of skill in the art to
produce polypeptides corresponding to all or part of the
full-length mammalian FEN-1 polypeptide sequences. Such
polypeptides may be produced in prokaryotic or eukaryotic
host cells by expression of polynucleotides encoding FEN-1,
or fragments and analogs thereof. Alternatively, such
polypeptides may be synthesized by chemical methods or
produced by in vitro translation systems using a
polynucleotide template to direct translation. Methods for
expression of heterologous proteins in recombinant hosts,
chemical synthesis of polypeptides, and in vitro translation
are well known in the art and are described further in
Maniatis et al., Molecular Cloning: A Laboratory Manual
(1989), 2nd Ed., Cold Spring Harbor, N.Y. and Berger and
Kimmel, Methods in Enzymology, Volume 152, Guide to Molecular
Cloning Techniques (1987), Academic Press, Inc., San Diego,
Calif.

Fragments or analogs of FEN-1 may be prepared by those
of skill in the art. Preferred amino- and carboxy-termini of
fragments or analogs of FEN-1 occur near boundaries of
functional domains. For example, but not for limitation,
such functional domains include: (1) domains conferring the
property of binding to 5' DNA flap substrate and/or cleaving
said flap substrate, and (2) conserved domains (e.g., an
N-region, I-region, or C-region).

One method by which structural and functional domains
may be identified is by comparison of the nucleotide and/or
amino acid sequence data shown in FIGS. 1 and 2 to public
or proprietary sequence databases. Preferably, computerized
comparison methods are used to identify sequence motifs or
predicted protein conformation domains that occur in other
proteins of known structure and/or function. For example,
the NAD-binding domains of dehydrogenases, particularly
lactate dehydrogenase and malate dehydrogenase, are similar
in conformation and have amino acid sequences that are
detectably homologous (Proteins, Structures and Molecular
Principles, (1984) Creighton (ed.), W. H. Freeman and
Company, New York, which is incorporated herein by
reference). Further, methods to identify protein sequences
that fold into a known three-dimensional structure are
known (Bowie et al. (1991) Science 253: 164). Thus, the
foregoing examples demonstrate that those of skill in the art
can recognize sequence motifs and structural conformations
that may be used to define structural and functional domains
in the FEN-1 sequences of the invention.

Additionally, computerized comparison of sequences
shown in FIG. 1 or FIG. 2 to existing sequence databases can
identify sequence motifs and structural conformations found
in other proteins or coding sequences that indicate similar
domains of the FEN-1 protein. For example but not for
limitation, the programs GAP, BESTFIT, FASTA, and
TFASTA in the Wisconsin Genetics Software Package
(Genetics Computer Group, 575 Science Dr., Madison,
Wis.) can be used to identify sequences in databases, such as
GenBank/EMBL, that have regions of homology with a
FEN-1 sequence. Such homologous regions are candidate
structural or functional domains. Alternatively, other
algorithms are provided for identifying such domains from
sequence data. Further, neural network methods, whether
implemented in hardware or software, may be used to: (1)
identify related protein sequences and nucleotide sequences,
and (2) define structural or functional domains in FEN-1
polypeptides (Brunak et al. (1991) J. Mol. Biol. 220: 49,
which is incorporated herein by reference).
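
The kind of local sequence comparison referred to above can be
sketched in Python with a score-only Smith-Waterman local alignment.
This is an illustrative stand-in, not the implementation used by any
of the named programs; the function name local_alignment_score and
the match, mismatch, and gap values are assumptions, and practical
protein comparisons would normally use a substitution matrix rather
than simple identity scoring.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score between a and b,
    using identity scoring and a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for ch_a in a:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Hypothetical toy sequences sharing the four-residue region "ACGT".
shared_region_score = local_alignment_score("GGACGTGG", "TTACGTTT")
```

A high score relative to the lengths compared flags a region worth
examining as a candidate domain; the threshold for "high" is left to
the practitioner.
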

Fragments or analogs comprising substantially one or
more functional domains may be fused to heterologous
polypeptide sequences, wherein the resultant fusion protein
exhibits the functional property(ies) conferred by the FEN-1
fragment. Alternatively, FEN-1 polypeptides wherein one or
more functional domains have been deleted will exhibit a loss
of the property normally conferred by the missing fragment.

By way of example and not limitation, the domain(s)
conferring the property of binding to 5' DNA flap structures
may be fused to β-galactosidase to produce a fusion protein
that can bind an immobilized 5' DNA flap substrate in a
binding reaction and which can enzymatically cleave the
5'-flap.

Although one class of preferred embodiments is fragments
having amino- and/or carboxy-termini corresponding
to amino acid positions near functional domain borders,
alternative FEN-1 fragments may be prepared. The choice of
the amino- and carboxy-termini of such fragments rests with
the discretion of the practitioner and will be made based on
experimental considerations such as ease of construction,
stability to proteolysis, thermal stability, immunological
reactivity, amino- or carboxyl-terminal residue modification,
or other considerations.

In addition to fragments, analogs of FEN-1 can be made.
Such analogs may include one or more deletions or additions
of amino acid sequence, either at the amino- or
carboxy-termini, or internally, or both; analogs may further
include sequence transpositions. Analogs may also comprise
amino acid substitutions, preferably conservative
substitutions. Additionally, analogs may include heterologous
sequences generally linked at the amino- or carboxy-terminus,
wherein the heterologous sequence(s) confer a functional
property to the resultant analog which is not indigenous to
the native FEN-1 protein. However, FEN-1 analogs must
comprise a segment of 25 amino acids that has substantial
similarity to a portion of the amino acid sequences shown in
FIG. 1 (B) or FIG. 2 (B) or other mammalian FEN-1 proteins,
respectively, and which has at least one of the requisite
functional properties (i.e., binds and/or cleaves 5' flap DNA
substrates). Preferred amino acid substitutions are those
which: (1) reduce susceptibility to proteolysis, (2) reduce
susceptibility to oxidation, (3) alter post-translational
modification of the analog, possibly including phosphorylation,
and (4) confer or modify other physicochemical or
functional properties of such analogs. FEN-1 analogs include
various muteins of a FEN-1 sequence other than the
naturally-occurring peptide sequence. For example, single
or multiple amino acid substitutions (preferably
conservative amino acid substitutions) may be made in the
naturally-occurring FEN-1 sequence.

Conservative amino acid substitution is a substitution of
an amino acid by a replacement amino acid which has
similar characteristics (e.g., those with acidic properties:
Asp and Glu). A conservative (or synonymous) amino acid
substitution should not substantially change the structural
characteristics of the parent sequence (e.g., a replacement
amino acid should not tend to break a helix that occurs in the
parent sequence, or disrupt other types of secondary
structure that characterizes the parent sequence). Examples of
art-recognized polypeptide secondary and tertiary structures
are described in Proteins, Structures and Molecular
Principles, (1984) Creighton (ed.), W. H. Freeman and
Company, New York; Introduction to Protein Structure,
(1991), C. Branden and J. Tooze, Garland Publishing, New
York, N.Y.; and Thornton et al. (1991) Nature 354: 105;
which are incorporated herein by reference.
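
The notion of a conservative substitution can be made concrete with
a short Python sketch. Only the acidic pair (Asp and Glu) is taken
from the text above; every other grouping in the GROUPS table, and
the function name is_conservative, is an illustrative assumption
based on common side-chain classifications and is not a definition
given in this specification.

```python
# One-letter amino acid codes grouped by side-chain character.
# Only the "acidic" group (Asp, Glu) comes from the text; the
# remaining groups are assumptions for illustration.
GROUPS = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "amide": set("NQ"),
    "hydroxyl": set("ST"),
    "aliphatic": set("AVLI"),
    "aromatic": set("FYW"),
    "sulfur": set("CM"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues differ but fall in the same group."""
    if original == replacement:
        return False  # identical residue: not a substitution at all
    return any(original in members and replacement in members
               for members in GROUPS.values())
```

Residues left out of every group (Gly and Pro in this table) are
never reported as conservative replacements, which matches the
caution above about replacements that tend to break a helix.
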
Similarly, full-length FEN-1 polypeptides and fragments,
analogs, and/or fusions thereof can be made by those of skill
in the art from the available FEN-1 gene, cDNA, and protein
sequences. U.S. Pat. No. 5,279,952 discloses a method for
using PCR to generate mutations (e.g., deletions) and
chimeric genes from known sequences.

Methods used to produce human or mouse FEN-1
polynucleotides and polypeptides can also be modified by those
of skill in the art to produce non-human FEN-1
polypeptides. For example, a sequence of a yeast FEN-1 protein
is shown in FIG. 3 (A). Similarly, full-length nonhuman
FEN-1 polypeptides and fragments, analogs, and/or fusions
thereof can be made by those of skill in the art from the
nonhuman gene, cDNA, and protein sequences.

Fusion proteins of FEN-1 can be made, such as fusions
with a GAL4 activation domain or DNA-binding domain,
and the like.

Native FEN-1 proteins, fragments thereof, or analogs
thereof can be used as reagents in binding assays to detect
binding to 5' DNA flap substrates or endonucleolytic
cleavage thereof, for identifying agents that interfere with
FEN-1 function; said agents are thereby identified as candidate
drugs which may be used, for example, to block DNA repair
following radiation or chemotherapy, to inhibit DNA
replication and cell replication, and/or to induce apoptosis
(e.g., to treat lymphocytic leukemias, carcinomas, sarcomas,
AIDS, neurodegenerative disease, senescence), and the like.
DNA flap binding reactions using FEN-1, wherein one or
more agents are added, are performed in parallel with a
control binding reaction that does not include an agent.
Agents which inhibit the specific binding of FEN-1 to a
DNA flap or cleavage thereof, as compared to a control
reaction, are identified as candidate FEN-1-modulating
drugs.
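
The comparison of agent-containing reactions against the control
reaction can be sketched as a percent-inhibition calculation in
Python. The function names, the 50 percent cutoff, and the signal
values are all hypothetical; the text above does not specify a
readout or a threshold, so this is only one way the comparison
might be scored.

```python
def percent_inhibition(control_signal: float, agent_signal: float) -> float:
    """Percent reduction in binding (or cleavage) signal relative to
    the agent-free control reaction."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (1.0 - agent_signal / control_signal)


def candidate_agents(control_signal: float,
                     agent_signals: dict[str, float],
                     threshold_pct: float = 50.0) -> list[str]:
    """Names of agents whose inhibition meets the (assumed) cutoff."""
    return sorted(
        name for name, signal in agent_signals.items()
        if percent_inhibition(control_signal, signal) >= threshold_pct
    )


# Hypothetical readouts in arbitrary units from parallel reactions.
hits = candidate_agents(1000.0, {"A": 250.0, "B": 900.0, "C": 500.0})
```

Running each agent alongside the control in the same experiment, as
the text describes, is what makes the ratio meaningful.
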

Peptidomimetics

In addition to FEN-1 polypeptides consisting only of
naturally-occurring amino acids, FEN-1 peptidomimetics are
also provided. For example, peptidomimetics can be suitable
as drugs for competitive inhibition of endogenous FEN-1
function, such as to inhibit DNA repair or replication in
neoplastic cells.

Peptide analogs are commonly used in the pharmaceutical
industry as non-peptide drugs with properties analogous to
those of the template peptide. These types of non-peptide
compound are termed "peptide mimetics" or
"peptidomimetics" (Fauchere, J. (1986) Adv. Drug Res. 15: 29;
Veber and Freidinger (1985) TINS p.392; and Evans et al. (1987)
J. Med. Chem 30: 1229, which are incorporated herein by
reference) and are usually developed with the aid of
computerized molecular modeling. Peptide mimetics that are
structurally similar to therapeutically useful peptides may be
used to produce an equivalent therapeutic or prophylactic
effect. Generally, peptidomimetics are structurally similar to
a paradigm polypeptide (i.e., a polypeptide that has a
biological or pharmacological activity), such as human FEN-1,
but have one or more peptide linkages optionally replaced
by a linkage selected from the group consisting of:
-CH2NH-, -CH2S-, -CH2-CH2-, -CH=CH-
(cis and trans), -COCH2-, -CH(OH)CH2-, and
-CH2SO-, by methods known in the art and further
described in the following references: Spatola, A. F. in
"Chemistry and Biochemistry of Amino Acids, Peptides,
and Proteins," B. Weinstein, eds., Marcel Dekker, New
York, p. 267 (1983); Spatola, A. F., Vega Data (March 1983),
Vol. 1, Issue 3, "Peptide Backbone Modifications" (general
review); Morley, J. S., Trends Pharm Sci (1980) pp. 463-468
(general review); Hudson, D. et al., Int J Pept Prot Res
(1979) 14:177-185 (-CH2NH-, -CH2CH2-); Spatola, A.
F. et al., Life Sci (1986) 38:1243-1249 (-CH2-S); Hann,
M. M., J Chem Soc Perkin Trans I (1982) 307-314 (-CH=CH-,
cis and trans); Almquist, R. G. et al., J Med Chem
(1980) 23:1392-1398 (-COCH2-); Jennings-White, C. et
al., Tetrahedron Lett (1982) 23:2533 (-COCH2-); Szelke,
M. et al., European Appln. EP 45665 (1982) CA: 97:39405
(1982) (-CH(OH)CH2-); Holladay, M. W. et al.,
Tetrahedron Lett (1983) 24:4401-4404 (-C(OH)CH2-); and
Hruby, V. J., Life Sci (1982) 31:189-199 (-CH2-S-);
each of which is incorporated herein by reference. A
particularly preferred non-peptide linkage is -CH2NH-.
Such peptide mimetics may have significant advantages over
polypeptide embodiments, including, for example: more
economical production, greater chemical stability, enhanced
pharmacological properties (half-life, absorption, potency,
efficacy, etc.), altered specificity (e.g., a broad-spectrum of
biological activities), reduced antigenicity, and others.
Labeling of peptidomimetics usually involves covalent
attachment of one or more labels, directly or through a
spacer (e.g., an amide group), to non-interfering position(s)
on the peptidomimetic that are predicted by quantitative
structure-activity data and/or molecular modeling. Such
non-interfering positions generally are positions that do not
form direct contacts with the macromolecule(s) to which
the peptidomimetic binds to produce the therapeutic effect.
Derivatization (e.g., labelling) of peptidomimetics should
not substantially interfere with the desired biological or
pharmacological activity of the peptidomimetic.

Systematic substitution of one or more amino acids of a
consensus sequence with a D-amino acid of the same type
(e.g., D-lysine in place of L-lysine) may be used to generate
more stable peptides. In addition, constrained peptides
comprising a consensus sequence or a substantially identical
consensus sequence variation may be generated by methods
known in the art (Rizo and Gierasch (1992) Ann. Rev.
Biochem. 61: 387, incorporated herein by reference); for
example, by adding internal cysteine residues capable of
forming intramolecular disulfide bridges which cyclize the
peptide. Cyclic peptides comprising a sequence of human
FEN-1 frequently are preferred.

Another embodiment involves the formation of FEN-1
mutants wherein the native protein or fragment has at least
one amino acid deleted or replaced by another amino acid
and the mutant exhibits altered biological activity from the
native protein or fragment.

The amino acid sequences of FEN-1 polypeptides
identified herein will enable those of skill in the art to produce
polypeptides corresponding to FEN-1 peptide sequences and
sequence variants thereof. Such polypeptides may be
produced in prokaryotic or eukaryotic host cells by expression
of polynucleotides encoding a FEN-1 peptide sequence,
frequently as part of a larger polypeptide. Alternatively, such
peptides may be synthesized by chemical methods. Methods
for expression of heterologous proteins in recombinant
hosts, chemical synthesis of polypeptides, and in vitro
translation are well known in the art and are described
further in Maniatis et al., Molecular Cloning: A Laboratory
Manual (1989), 2nd Ed., Cold Spring Harbor, N.Y.; Berger
and Kimmel, Methods in Enzymology, Volume 152, Guide to
Molecular Cloning Techniques (1987), Academic Press,
Inc., San Diego, Calif.; Merrifield, J. (1969) J. Am. Chem.
Soc. 91: 501; Chaiken I. M. (1981) CRC Crit. Rev. Biochem.
11: 255; Kaiser et al. (1989) Science 243: 187; Merrifield, B.
(1986) Science 232: 342; Kent, S. B. H. (1988) Ann. Rev.
Biochem. 57: 957; and Offord, R. E. (1980) Semisynthetic
Proteins, Wiley Publishing, which are incorporated herein
by reference.

Peptides comprising a FEN-1 polypeptide sequence and
peptidomimetics thereof can be produced, typically by direct
chemical synthesis or recombinant expression, and used as
agents to competitively inhibit endogenous FEN-1 activity.
The peptides are frequently produced as modified peptides,
with nonpeptide moieties attached by covalent linkage to the
N-terminus and/or C-terminus. In certain preferred
embodiments, either the carboxy-terminus or the
amino-terminus, or both, are chemically modified. The most
common modifications of the terminal amino and carboxyl
groups are acetylation and amidation, respectively.
Amino-terminal modifications such as acylation (e.g.,
acetylation) or alkylation (e.g., methylation) and
carboxy-terminal modifications such as amidation, as well as
other terminal modifications, including cyclization, may be
incorporated into various embodiments of the invention.
Certain amino-terminal and/or carboxy-terminal modifications
and/or peptide extensions to the core sequence can provide
advantageous physical, chemical, biochemical, and
pharmacological properties, such as: enhanced stability,
increased potency and/or efficacy, resistance to serum
proteases, desirable pharmacokinetic properties, and others.
Such peptides or peptidomimetics may be used
therapeutically to treat disease by altering the process of DNA
repair, replication, or recombination in a cell population of a
patient.

Production and Applications of α-FEN-1 Antibodies
Native FEN-1 proteins, fragments thereof, or analogs
thereof, may be used to immunize an animal for the
production of specific antibodies. These antibodies may
comprise a polyclonal antiserum or may comprise a monoclonal
antibody produced by hybridoma cells. For general methods
to prepare antibodies, see Antibodies: A Laboratory Manual,
(1988) E. Harlow and D. Lane, Cold Spring Harbor
Laboratory, Cold Spring Harbor, N.Y., which is incorporated
herein by reference.

For example but not for limitation, a recombinantly
produced fragment of FEN-1 can be injected into a mouse
along with an adjuvant following immunization protocols
known to those of skill in the art so as to generate an immune
response. Typically, approximately at least 1-50 µg of a
FEN-1 fragment or analog is used for the initial
immunization, depending upon the length of the polypeptide.
Alternatively or in combination with a recombinantly
produced FEN-1 polypeptide, a chemically synthesized
peptide having a FEN-1 sequence may be used as an immunogen
to raise antibodies which bind a FEN-1 protein, such as
the native FEN-1 polypeptide having the sequence shown
essentially in FIG. 1 (A) or FIG. 2 (A), a native human
FEN-1 polypeptide, a polypeptide comprising a FEN-1
epitope, or a FEN-1 fusion protein. Immunoglobulins which
bind the recombinant fragment with a binding affinity of at
least 1x10^7 M^-1 can be harvested from the immunized
animal as an antiserum, and may be further purified by
immunoaffinity chromatography or other means.
Additionally, spleen cells are harvested from the immunized
animal (typically rat or mouse) and fused to myeloma cells
to produce a bank of antibody-secreting hybridoma cells.
The bank of hybridomas can be screened for clones that
secrete immunoglobulins which bind the recombinantly-
produced FEN-1 polypeptide (or chemically synthesized
FEN-1 polypeptide) with an affinity of at least 1x10^6 M^-1.
Animals other than mice and rats may be used to raise
antibodies; for example, goats, rabbits, sheep, and chickens
may also be employed to raise antibodies reactive with a
FEN-1 protein. Transgenic mice having the capacity to
produce substantially human antibodies also may be
immunized and used for a source of α-FEN-1 antiserum and/or
for making monoclonal-secreting hybridomas.
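
The affinity thresholds above are stated as association constants
in M^-1. The Python sketch below converts an association constant
to the equivalent dissociation constant and filters a set of
hybridoma clones against the 1x10^6 M^-1 cutoff. The function
names, clone names, and affinity values are hypothetical and serve
only to illustrate the arithmetic.

```python
def dissociation_constant_molar(ka_per_molar: float) -> float:
    """K_d (M) is the reciprocal of the association constant K_a (M^-1)."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / ka_per_molar


def clones_meeting_affinity(clone_ka: dict[str, float],
                            min_ka_per_molar: float = 1e6) -> list[str]:
    """Names of clones whose measured K_a meets the stated minimum."""
    return sorted(name for name, ka in clone_ka.items()
                  if ka >= min_ka_per_molar)


# Hypothetical measured association constants (M^-1) for three clones.
selected = clones_meeting_affinity({"1A": 5e5, "2B": 3e6, "3C": 2e8})
```

Under this conversion, 1x10^7 M^-1 corresponds to a dissociation
constant of about 100 nM and 1x10^6 M^-1 to about 1 µM.
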

Bacteriophage antibody display libraries may also be
screened for binding to a FEN-1 polypeptide, such as a
full-length FEN-1 protein, a FEN-1 fragment, or a fusion
protein comprising a FEN-1 polypeptide sequence
comprising a FEN-1 epitope (generally at least 3-5 contiguous
amino acids). Generally such FEN-1 peptides and the fusion
protein portions consisting of FEN-1 sequences for
screening antibody libraries comprise about at least 3 to 5
contiguous amino acids of FEN-1, frequently at least 7
contiguous amino acids of FEN-1, usually comprise at least 10
contiguous amino acids of FEN-1, and most usually
comprise a FEN-1 sequence of at least 14 contiguous amino
acids as shown in FIG. 1 (A) or FIG. 2 (A).
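
Enumerating every stretch of contiguous amino acids of a chosen
length is a simple sliding-window operation, sketched below in
Python. The function name contiguous_peptides and the placeholder
sequence are hypothetical; the placeholder is an arbitrary string
and is not the FEN-1 sequence of FIG. 1 or FIG. 2.

```python
def contiguous_peptides(protein: str, length: int = 14) -> list[str]:
    """All overlapping windows of `length` contiguous residues, in
    order from the amino-terminus to the carboxy-terminus."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return [protein[i:i + length]
            for i in range(len(protein) - length + 1)]


# Arbitrary 20-residue placeholder, used only to show the windowing.
PLACEHOLDER = "ACDEFGHIKLMNPQRSTVWY"
windows_14 = contiguous_peptides(PLACEHOLDER, 14)
windows_7 = contiguous_peptides(PLACEHOLDER, 7)
```

A sequence of n residues yields n - length + 1 windows, so shorter
peptide lengths produce more candidates from the same sequence.
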

Combinatorial libraries of antibodies have been generated
in bacteriophage lambda expression systems which may be
screened as bacteriophage plaques or as colonies of lysogens
(Huse et al. (1989) Science 246: 1275; Caton and Koprowski
(1990) Proc. Natl. Acad. Sci. (U.S.A.) 87: 6450; Mullinax et
al. (1990) Proc. Natl. Acad. Sci. (U.S.A.) 87: 8095; Persson
et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88: 2432).
Various embodiments of bacteriophage antibody display
libraries and lambda phage expression libraries have been
described (Kang et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.)
88: 4363; Clackson et al. (1991) Nature 352: 624;
McCafferty et al. (1990) Nature 348: 552; Burton et al. (1991)
Proc. Natl. Acad. Sci. (U.S.A.) 88: 10134; Hoogenboom et
al. (1991) Nucleic Acids Res. 19: 4133; Chang et al. (1991)
J. Immunol. 147: 3610; Breitling et al. (1991) Gene 104:
147; Marks et al. (1991) J. Mol. Biol. 222: 581; Barbas et al.
(1992) Proc. Natl. Acad. Sci. (U.S.A.) 89: 4457; Hawkins
and Winter (1992) J. Immunol. 22: 867; Marks et al. (1992)
Biotechnology 10: 779; Marks et al. (1992) J. Biol. Chem.
267: 16007; Lowman et al. (1991) Biochemistry 30: 10832;
Lerner et al. (1992) Science 258: 1313, incorporated herein
by reference). Typically, a bacteriophage antibody display
library is screened with a FEN-1 polypeptide that is
immobilized (e.g., by covalent linkage to a chromatography resin
to enrich for reactive phage by affinity chromatography)
and/or labeled (e.g., to screen plaque or colony lifts).

FEN-1 polypeptides which are useful as immunogens, for
diagnostic detection of α-FEN-1 antibodies in a sample, for
diagnostic detection and quantitation of FEN-1 protein in a
sample (e.g., by standardized competitive ELISA), or for
screening a bacteriophage antibody display library, are
suitably obtained in substantially pure form, that is, typically
about 50 percent (w/w) or more purity, substantially free of
interfering proteins and contaminants. Preferably, these
polypeptides are isolated or synthesized in a purity of at least
80 percent (w/w) and, more preferably, in at least about 95
percent (w/w) purity, being substantially free of other
proteins of humans, mice, or other contaminants.

For some applications of these antibodies, such as
identifying immunocrossreactive proteins, the desired antiserum
or monoclonal antibody(ies) is/are not monospecific. In
these instances, it may be preferable to use a synthetic or
recombinant fragment of FEN-1 as an antigen rather than
using the entire native protein. More specifically, where the
object is to identify immunocrossreactive polypeptides that
comprise a particular structural moiety, such as a DNA
flap-binding domain, it is preferable to use as an antigen a
fragment corresponding to part or all of a commensurate
structural domain in the FEN-1 protein. Production of
recombinant or synthetic fragments having such defined
amino- and carboxy-termini is provided by the FEN-1
sequences shown in FIG. 1 (A) and FIG. 2 (A).

If an antiserum is raised to a FEN-1 fusion polypeptide,
such as a fusion protein comprising a FEN-1 immunogenic
epitope fused to β-galactosidase or glutathione
S-transferase, the antiserum is preferably preadsorbed with
the non-FEN-1 fusion partner (e.g., β-galactosidase or
glutathione S-transferase) to deplete the antiserum of
antibodies that react with (i.e., specifically bind to) the
non-FEN-1 portion of the fusion protein that serves as the
immunogen. Monoclonal or polyclonal antibodies which bind to
the human and/or murine FEN-1 protein can be used to detect
the presence of human or murine FEN-1 polypeptides in a
sample, such as a Western blot of denatured protein (e.g., a
nitrocellulose blot of an SDS-PAGE) obtained from a
lymphocyte sample of a patient. Preferably quantitative
detection is performed, such as by densitometric scanning and
signal integration of a Western blot. The monoclonal or
polyclonal antibodies will bind to the denatured FEN-1
epitopes and may be identified visually or by other optical
means with a labeled second antibody or labeled
Staphylococcus aureus protein A by methods known in the art.

One use of such antibodies is to screen cDNA expression 
libraries, preferably containing cDNA derived from human 
or murine mRNA from various tissues, for identifying 

55 clones containing cONA inserts which encode structurally- 
related, immunocrossreactive proteins, that are candidate 
novel FEN-1 binding factors or F"EN-1 -related proteins. 
Such screening of cDNA expression libraries is well known 
in the art, and is further described in Young et al., Proc. Natl. 

60 Acad. Sci. USA. 80:1194-1198 (1983), which is incorpo- 
rated herein by reference) as well as other published sources. 
Another use of such antibodies is to identify and/or purify 
immunocrossreactive proteins that are structurally or
evolutionarily related to the native FEN-1 protein or to the
corresponding FEN-1 fragment (e.g., functional domain)
used to generate the antibody. The anti-FEN-1 antibodies of
the invention can be used to measure levels of FEN-1 protein
in a cell or cell population, for example in a cell explant
(e.g., lymphocyte sample) obtained from a patient. The
anti-FEN-1 antibodies can be used to measure the corresponding
protein level by various methods, including but not
limited to: (1) standardized ELISA on cell extracts, (2)
immunoprecipitation of cell extracts followed by polyacry- 
lamide gel electrophoresis of the immunoprecipitated prod- 
ucts and quantitative detection of the band(s) corresponding 
to FEN-1, and (3) in situ detection by immunohistochemical
staining with the anti-FEN-1 antibodies and detection with
a labeled second antibody. The measurement of the FEN-1
protein level in a cell or cell population is informative 
regarding the replicative status and/or DNA damage status 
of the cell or cell population. 

Various other uses of such antibodies are to diagnose
and/or stage leukemias or other neoplasms, and for therapeutic
application (e.g., as cationized antibodies or by targeted
liposomal delivery) to treat neoplasia, autoimmune
disease, AIDS, and the like. 

An antiserum which can be utilized for this purpose can
be obtained by conventional procedures. One exemplary
procedure involves the immunization of a mammal, such as
a rabbit, which induces the formation of polyclonal antibodies
against FEN-1. Monoclonal antibodies are also being
generated from already immunized hamsters. Such antibodies
can be used to detect the presence and level of the FEN-1
protein.

It is also possible to use the proteins for the immunological
detection of FEN-1 and associations thereof with standard
assays as well as assays using markers, such as
radioimmunoassays or enzyme immunoassays.

The detection and determination of FEN-1 has significant 
diagnostic importance. For example, the detection of pro- 
teins conferring DNA repair capacity or replication capabil- 
ity would be advantageous in cancer therapy and controlling 
hypertrophies, as well as staging the degree of genomic
instability in dysplastic variants of aggressive neoplasms.
The detection or determination of FEN-1 proteins will be
beneficial in staging senescence and immunodeficiency
disease, including HIV-I, II and III, and in neurodegenerative
and ischemic cell death. Thus these FEN-1 proteins and
their antibodies can be employed as a marker to monitor,
check or detect the course of disease.

Identification and Isolation of Proteins That Bind FEN-1

Proteins that bind to FEN-1 are potentially important 
regulatory proteins. Such proteins may be targets for novel 
antineoplastic agents or anti-inflammatory agents,
immunomodulatory agents, and the like. These proteins are referred
to herein as accessory proteins. Accessory proteins may be
isolated by various methods known in the art. For example
and not limitation, such accessory proteins may be DNA
polymerases, ADP-ribosyltransferases, purine/pyrimidine
deglycosylases, and the like.

One preferred method of isolating accessory proteins is by 
contacting a FEN-1 polypeptide to an antibody that binds the 
FEN-1 polypeptide, and isolating resultant immune complexes.
These immune complexes may contain accessory
proteins bound to the FEN-1 polypeptide. The accessory
proteins may be identified and isolated by denaturing the
immune complexes with a denaturing agent and, preferably,
a reducing agent. The denatured, and preferably reduced,
proteins can be electrophoresed on a polyacrylamide gel.
Putative accessory proteins can be identified on the
polyacrylamide gel by one or more of various well known
methods (e.g., Coomassie staining, Western blotting, silver
staining, etc.), and isolated by resection of a portion of the
polyacrylamide gel containing the relevant identified
polypeptide and elution of the polypeptide from the gel 
portion.

A putative accessory protein may be identified as an
accessory protein by demonstration that the protein binds to
FEN-1 specifically. Such binding may be shown in vitro by
various means, including, but not limited to, binding assays
employing a putative accessory protein that has been renatured
subsequent to isolation by a polyacrylamide gel
electrophoresis method. Alternatively, binding assays employing
recombinant or chemically synthesized putative accessory 
protein may be used. For example, a putative accessory
protein may be isolated and all or part of its amino acid
sequence determined by chemical sequencing, such as
Edman degradation. The amino acid sequence information
may be used to chemically synthesize the putative accessory
protein. The amino acid sequence may also be used to
produce a recombinant putative accessory protein by: (1)
isolating a cDNA clone encoding the putative accessory
protein by screening a cDNA library with degenerate
oligonucleotide probes according to the amino acid sequence
data, (2) expressing the cDNA in a host cell, and (3) isolating
the putative accessory protein. Alternatively, a polynucleotide
encoding a FEN-1 polypeptide may be constructed by
oligonucleotide synthesis, placed in an expression vector, 
and expressed in a host cell. 
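
For illustration and not limitation, the choice of a peptide stretch on which to base degenerate oligonucleotide probes might be sketched as follows. The sketch counts, under the standard genetic code, how many distinct coding sequences could encode each window of a peptide, and picks the window with the fewest. The function names and the example peptide are hypothetical and do not correspond to any sequence disclosed herein.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
# ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

# Number of codons encoding each amino acid (e.g. Met -> 1, Leu -> 6).
CODON_COUNT = {}
for codon, aa in zip(CODONS, AMINO):
    CODON_COUNT[aa] = CODON_COUNT.get(aa, 0) + 1


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that encode the peptide."""
    n = 1
    for aa in peptide:
        n *= CODON_COUNT[aa]
    return n


def least_degenerate_window(peptide, length):
    """Return (window, degeneracy) for the least degenerate window."""
    best = None
    for i in range(len(peptide) - length + 1):
        window = peptide[i:i + length]
        d = degeneracy(window)
        if best is None or d < best[1]:
            best = (window, d)
    return best


# Hypothetical peptide fragment, e.g. from Edman degradation.
fragment = "LSMWKFRL"
window, n_sequences = least_degenerate_window(fragment, 4)
```

Windows rich in methionine and tryptophan are favored because each has a single codon, whereas leucine, serine and arginine each have six.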

Putative accessory proteins that bind FEN-1 complexes in
vitro are identified as accessory proteins. Accessory proteins
may also be identified by crosslinking in vivo with bifunctional
crosslinking reagents (e.g., dimethylsuberimidate,
glutaraldehyde, etc.) and subsequent isolation of crosslinked
products that include a FEN-1 polypeptide. For a general
discussion of cross-linking, see Kunkel et al. (1981) Mol.
Cell Biochem. 34: 3, which is incorporated herein by
reference. Preferably, the bifunctional crosslinking reagent
will produce crosslinks which may be reversed under specific
conditions after isolation of the crosslinked complex so
as to facilitate isolation of the accessory protein from the
FEN-1 polypeptide. Isolation of crosslinked complexes that
include a FEN-1 polypeptide is preferably accomplished by
binding an antibody that binds a FEN-1 polypeptide with an
affinity of at least 1×10⁷ M⁻¹ to a population of crosslinked
complexes and recovering only those complexes that bind to
the antibody with an affinity of at least 1×10⁷ M⁻¹. Polypeptides
that are crosslinked to a FEN-1 polypeptide are identified
as accessory proteins.
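
For illustration and not limitation, the affinity threshold recited above can be restated as a dissociation constant. The sketch below assumes simple 1:1 binding; the function names and the example concentrations are hypothetical.

```python
AFFINITY_THRESHOLD = 1e7  # association constant Ka in M^-1, as recited above


def kd_from_ka(ka):
    """Dissociation constant (M) for an association constant (M^-1)."""
    return 1.0 / ka


def fraction_bound(free_ligand_molar, ka):
    """Fraction of sites occupied at equilibrium, assuming 1:1 binding."""
    x = ka * free_ligand_molar
    return x / (1.0 + x)


def meets_threshold(ka, threshold=AFFINITY_THRESHOLD):
    """True if the antibody binds at least as tightly as the threshold."""
    return ka >= threshold


# An affinity of 1e7 M^-1 corresponds to a Kd of 1e-7 M, i.e. 100 nM.
kd_at_threshold = kd_from_ka(AFFINITY_THRESHOLD)
# At a free concentration equal to Kd, about half of the sites are occupied.
half_occupancy = fraction_bound(1e-7, AFFINITY_THRESHOLD)
```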

Also, an expression library, such as a λgt11 cDNA expression
library (Dunn et al. (1989) J. Biol. Chem. 264: 13057),
can be screened with a labeled FEN-1 polypeptide to
identify cDNAs encoding polypeptides which specifically
bind to the FEN-1 polypeptide. For these procedures, cDNA
libraries usually comprise mammalian cDNA populations,
typically human, mouse, or rat, and may represent cDNA
produced from RNA of one cell type, tissue, or organ and
one or more developmental stages. Specific binding for
screening cDNA expression libraries is usually provided by
including one or more blocking agents (e.g., albumin, nonfat
dry milk solids, etc.) prior to and/or concomitant with
contacting the labeled FEN-1 polypeptide (and/or labeled 
anti-FEN-1 antibody). 

Screening assays can be developed for identifying candidate
antineoplastic agents as being agents which inhibit
binding of FEN-1 to an accessory protein under suitable
binding conditions. 

Yeast Two-Hybrid Screening Assays 

An approach to identifying polypeptide sequences which
bind to a predetermined polypeptide sequence has been to
use a so-called "two-hybrid" system wherein the predetermined
polypeptide sequence is present in a fusion protein
(Chien et al. (1991) Proc. Natl. Acad. Sci. (USA) 88: 9578).
This approach identifies protein-protein interactions in vivo
through reconstitution of a transcriptional activator (Fields S
and Song O (1989) Nature 340: 245), the yeast Gal4
transcription protein. Typically, the method is based on the
properties of the yeast Gal4 protein, which consists of
separable domains responsible for DNA-binding and transcriptional
activation. Polynucleotides encoding two hybrid
proteins, one consisting of the yeast Gal4 DNA-binding
domain fused to a polypeptide sequence of a known protein
and the other consisting of the Gal4 activation domain fused
to a polypeptide sequence of a second protein, are constructed
and introduced into a yeast host cell. Intermolecular
binding between the two fusion proteins reconstitutes the
Gal4 DNA-binding domain with the Gal4 activation
domain, which leads to the transcriptional activation of a
reporter gene (e.g., lacZ, HIS3) which is operably linked to
a Gal4 binding site. Typically, the two-hybrid method is used
to identify novel polypeptide sequences which interact with
a known protein (Silver S C and Hunt S W (1993) Mol. Biol.
Rep. 17: 155; Durfee et al. (1993) Genes Devel. 7: 555; Yang
et al. (1992) Science 257: 680; Luban et al. (1993) Cell 73:
1067; Hardy et al. (1992) Genes Devel. 6: 801; Bartel et al.
(1993) Biotechniques 14: 920; and Vojtek et al. (1993) Cell
74: 205). However, variations of the two-hybrid method
have been used to identify mutations of a known protein that
affect its binding to a second known protein (Li B and Fields
S (1993) FASEB J. 7: 957; Lalo et al. (1993) Proc. Natl.
Acad. Sci. (USA) 90: 5524; Jackson et al. (1993) Mol. Cell.
Biol. 13: 2899; and Madura et al. (1993) J. Biol. Chem. 268:
12046). Two-hybrid systems have also been used to identify
interacting structural domains of two known proteins
(Bardwell et al. (1993) Mol. Microbiol. 8: 1177;
Chakraborty et al. (1992) J. Biol. Chem. 267: 17498;
Staudinger et al. (1993) J. Biol. Chem. 268: 4608; and Milne
G T and Weaver D T (1993) Genes Devel. 7: 1755) or
domains responsible for oligomerization of a single protein
(Iwabuchi et al. (1993) Oncogene 8: 1693; Bogerd et al.
(1993) J. Virol. 67: 5030). Variations of two-hybrid systems
have been used to study the in vivo activity of a proteolytic
enzyme (Dasmahapatra et al. (1992) Proc. Natl. Acad. Sci.
(USA) 89: 4159). Alternatively, an E. coli/BCCP interactive
screening system (Germino et al. (1993) Proc. Natl. Acad.
Sci. (USA) 90: 933; Guarente L (1993) Proc. Natl. Acad.
Sci. (USA) 90: 1639) can be used to identify interacting
protein sequences (i.e., protein sequences which
heterodimerize or form higher order heteromultimers).

Each of these two-hybrid methods relies upon a positive
association between two Gal4 fusion proteins thereby reconstituting
a functional Gal4 transcriptional activator which
then induces transcription of a reporter gene operably linked
to a Gal4 binding site. Transcription of the reporter gene
produces a positive readout, typically manifested either (1)
as an enzyme activity (e.g., β-galactosidase) that can be
identified by a colorimetric enzyme assay or (2) as enhanced
cell growth on a defined medium (e.g., HIS3). A positive
readout condition is generally identified as one or more of
the following detectable conditions: (1) an increased
transcription rate of a predetermined reporter gene, (2) an
increased concentration or abundance of a polypeptide product
encoded by a predetermined reporter gene, typically such
as an enzyme which can be readily assayed in vivo, and/or
(3) a selectable or otherwise identifiable phenotypic change
in an organism (e.g., yeast) harboring the reverse two-hybrid
system. Generally, a selectable or otherwise identifiable
phenotypic change that characterizes a positive readout
condition confers upon the organism either: a selective
growth advantage on a defined medium, a mating
phenotype, a characteristic morphology or developmental
stage, drug resistance, or a detectable enzymatic activity
(e.g., β-galactosidase, luciferase, alkaline phosphatase, and
the like).

Transcriptional activators are proteins that positively
regulate the expression of specific genes. They can be
functionally dissected into two structural domains: one
region that binds to specific DNA sequences and thereby
confers specificity, and another region termed the activation
domain that binds to protein components of the basal gene
expression machinery (Ma and Ptashne (1988) Cell 55:
443). These two domains need to be physically connected in
order to function as a transcriptional activator. Two-hybrid
systems exploit this finding by hooking up an isolated DNA
binding domain to one protein (protein X), while hooking up
the isolated activation domain to another protein (protein Y).
When X and Y interact to a significant extent, the DNA
binding and activation domains will now be connected and
the transcriptional activator function reconstituted (Fields
and Song (1989) Nature 340: 245). The yeast host strain is
engineered so that the reconstituted transcriptional activator
drives the expression of a specific reporter gene such as
HIS3 or lacZ, which provides the read-out for the
protein-protein interaction (Fields and Song (1989) op.cit.; Chien et
al. (1991) op.cit.). One advantage of two-hybrid systems for
monitoring protein-protein interactions is their sensitivity in
detection of physically weak, but physiologically important,
protein-protein interactions. As such, they offer a significant
advantage over other methods for detecting protein-protein
interactions (e.g., ELISA assay).

The invention also provides host organisms (typically
unicellular organisms) which harbor a FEN-1 protein
two-hybrid system, typically in the form of polynucleotides
encoding a first hybrid protein, a second hybrid protein, and
a reporter gene, wherein said polynucleotide(s) are either
stably replicated or introduced for transient expression. In an
embodiment, the host organism is a yeast cell (e.g.,
Saccharomyces cerevisiae) in which the reporter gene
transcriptional regulatory sequence comprises a Gal4-responsive
promoter. 

Yeast comprising (1) an expression cassette encoding a
GAL4 DNA binding domain (or GAL4 activator domain)
fused to a binding fragment of FEN-1, (2) an expression
cassette encoding a GAL4 DNA activator domain (or GAL4
binding domain, respectively) fused to a member of a cDNA
library, and (3) a reporter gene (e.g., β-galactosidase)
comprising a cis-linked GAL4 transcriptional response element
can be used for screening for cDNA sequences encoding
polypeptides which bind to FEN-1 with high affinity.

Yeast two-hybrid systems may be used to screen a mammalian
(typically human) cDNA expression library, wherein
cDNA is fused to a GAL4 DNA binding domain or activator
domain, and a FEN-1 polypeptide sequence is fused to a
GAL4 activator domain or DNA binding domain, respectively.
Such a yeast two-hybrid system can screen for
cDNAs that encode proteins which bind to FEN-1
sequences. For example, a cDNA library can be produced
from mRNA from a human mature B cell (Namalwa) line
(Ambrus et al. (1993) Proc. Natl. Acad. Sci. (USA)) or
other suitable cell type. Such a cDNA library cloned in a
yeast two-hybrid expression system (Chien et al. (1991)
Proc. Natl. Acad. Sci. (USA) 88: 9578) can be used to
identify cDNAs which encode proteins that interact with
FEN-1 and thereby produce expression of the GAL4-dependent
reporter gene. Polypeptides which interact with
FEN-1 can also be identified by immunoprecipitation of
FEN-1 with antibody and identification of co-precipitating
species. Further, polypeptides that bind FEN-1 can be identified
by screening a peptide library (e.g., a bacteriophage
peptide display library, a spatially defined VLSIPS peptide
array, and the like) with a FEN-1 polypeptide.

Once such cDNAs encoding FEN-1-interacting polypeptides
are identified, they may be used for screening a bank
of compounds (e.g., small molecule libraries) to identify
agents which inhibit the binding interaction. Yeast comprising
(1) an expression cassette encoding a GAL4 DNA
binding domain (or GAL4 activator domain) fused to a
binding fragment of FEN-1, (2) an expression cassette
encoding a GAL4 DNA activator domain (or GAL4 binding
domain, respectively) fused to the selected cDNA encoding
the FEN-1-interacting protein, and (3) a reporter gene (e.g.,
β-galactosidase) comprising a cis-linked GAL4 transcriptional
response element can be used for screening for agents
which inhibit the cDNA-encoded polypeptide from binding
to FEN-1 with high affinity. Such yeast are incubated with a
test agent and expression of the reporter gene (e.g.,
β-galactosidase) is determined; the capacity of the agent to
inhibit expression of the reporter gene as compared to a 
control culture identifies the agent as a candidate FEN-1 
modulatory agent. 
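
For illustration and not limitation, the comparison of reporter expression against a control culture might be sketched as follows. The function names, the 50 percent cutoff, and the readings are hypothetical choices made for this example; a practitioner would select a cutoff appropriate to the assay.

```python
def percent_inhibition(test_signal, control_signal, background=0.0):
    """Reduction in reporter signal relative to the no-agent control, in %."""
    control_net = control_signal - background
    if control_net <= 0:
        raise ValueError("control signal must exceed background")
    test_net = test_signal - background
    return 100.0 * (1.0 - test_net / control_net)


def candidate_agents(readings, control_signal, background=0.0, cutoff=50.0):
    """Names of agents whose inhibition meets the (assumed) cutoff."""
    hits = []
    for name in sorted(readings):
        if percent_inhibition(readings[name], control_signal, background) >= cutoff:
            hits.append(name)
    return hits


# Hypothetical beta-galactosidase readings (arbitrary units).
control = 100.0
readings = {"agent_A": 20.0, "agent_B": 90.0, "agent_C": 45.0}
hits = candidate_agents(readings, control)
```

A reduced reporter signal may also reflect general toxicity of the agent to the yeast rather than disruption of the binding interaction, so a parallel control with an unrelated interacting pair may be useful.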

FEN-1-modulating agents which reduce the cell's capacity
to repair DNA damage or inhibit DNA replication (e.g.,
by competitively inhibiting endogenous naturally-occurring
FEN-1) are candidate antineoplastic agents or sensitizing
agents which sensitize cells (e.g., neoplastic cells) to DNA
damaging agents (e.g., alkylating agents and ionizing
radiation). Candidate antineoplastic agents are then tested
further for antineoplastic activity in assays which are routinely
used to predict suitability for use as human antineoplastic
drugs. Examples of these assays include, but are not
limited to: (1) ability of the candidate agent to inhibit the
ability of anchorage-independent transformed cells to grow
in soft agar, (2) ability to reduce tumorigenicity of transformed
cells transplanted into nu/nu mice, (3) ability to
reverse morphological transformation of transformed cells,
(4) ability to reduce growth of transplanted tumors in nu/nu
mice, (5) ability to inhibit formation of tumors or preneoplastic
cells in animal models of spontaneous or chemically-induced
carcinogenesis, and (6) ability to induce a more
differentiated phenotype in transformed cells to which the
agent is applied.

Assays for detecting the ability of agents to inhibit or 
augment the DNA flap binding and/or cleavage activity of 
FEN-1 provide for facile high-throughput screening of agent 
banks (e.g., compound libraries, peptide libraries, and the
like) to identify FEN-1 antagonists or agonists. Such antago- 
nists and agonists may modulate FEN-1 activity and thereby 
modulate DNA repair competence and replicative potential. 

Administration of an efficacious dose of an agent capable 
of specifically inhibiting FEN-1 activity to a patient can be
used as a therapeutic or prophylactic method for treating
pathological conditions (e.g., cancer, inflammation,
lymphoproliferative diseases, autoimmune disease, neurodegenerative
diseases, and the like) which are effectively treated
by modulating FEN-1 activity and DNA repair and
replication. Such agents which inhibit VDJ immunoglobulin
recombination, isotype switching, T cell receptor gene 
shuffling, and the like may serve as immunosuppressant 
agents. 

DNA flap substrates, cleavage and binding reactions and
the like are practiced with reference to the Experimental
Examples and Harrington and Lieber (1994) Genes and
Development 8: 1344; Harrington and Lieber (1994) The
EMBO J. 13: 1235; Hiraoka et al. (1995) Genomics 25: 220;
and Harrington and Lieber (1995) J. Biol. Chem. 270: 4503.
Particular aqueous conditions may be selected by the practitioner
according to conventional methods. For general
guidance, the following buffered aqueous conditions may be
used: 10-250 mM NaCl, 5-50 mM Tris HCl, pH 5-8, with
optional addition of divalent cation(s) and/or metal chelators
and/or nonionic detergents and/or membrane fractions. It is
appreciated by those in the art that additions, deletions,
modifications (such as pH) and substitutions (such as KCl
substituting for NaCl or buffer substitution) may be made to
these basic conditions. Modifications can be made to the
basic binding reaction conditions so long as specific flap
cleavage occurs in the control reaction(s). Conditions that do
not permit specific binding and cleavage in control reactions
(no agent included) are not suitable for use in binding
assays.

Preferably, for determining binding of FEN-1 polypeptides
to immobilized DNA flap structures, the FEN-1
polypeptide species is labeled with a detectable marker.
Suitable labeling includes, but is not limited to, radiolabeling
by incorporation of a radiolabeled amino acid (e.g.,
¹⁴C-labeled leucine, ³H-labeled glycine, ³⁵S-labeled
methionine), radiolabeling by post-translational radioiodination
with ¹²⁵I or ¹³¹I (e.g., Bolton-Hunter reaction and
chloramine T), labeling by post-translational phosphorylation
with ³²P (e.g., phosphorylase and inorganic radiolabeled
phosphate), fluorescent labeling by incorporation of a
fluorescent label (e.g., fluorescein or rhodamine), or labeling by
other conventional methods known in the art. In embodi- 
ments where one of the polypeptide species is immobilized 
by linkage to a substrate, the other polypeptide is generally 
labeled with a detectable marker. 

Labeled FEN-1 polypeptide(s) are contacted with immobilized
flap substrates, or labeled flap substrates are incubated
with immobilized FEN-1 polypeptides in the presence
of unlabeled non-specific competitor under aqueous conditions
as described herein. The time and temperature of
incubation of a binding reaction may be varied, so long as
the selected conditions permit specific binding to occur in a
control reaction where no agent is present. Preferable
embodiments employ a reaction temperature of at least
about 15 degrees Centigrade, more preferably 35 to 42
degrees Centigrade, and a time of incubation of at least
approximately 15 seconds, although longer incubation
periods are preferable so that, in some embodiments, a
binding equilibrium is attained. Binding kinetics and the
thermodynamic stability of bound complexes determine the
latitude available for varying the time, temperature, salt, pH,
and other reaction conditions. However, for any particular
embodiment, desired binding reaction conditions can be
calibrated readily by the practitioner using conventional
methods in the art, which may include binding analysis
using Scatchard analysis, Hill analysis, and other methods
(Proteins, Structures and Molecular Principles, (1984)
Creighton (ed.), W. H. Freeman and Company, New York).
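
For illustration and not limitation, a Scatchard analysis of equilibrium binding data might be sketched as follows. For a single class of independent sites, a plot of bound/free against bound is a straight line with slope -1/Kd and x-intercept Bmax. The function name and the synthetic data are assumptions made for this example; real data may deviate from linearity (e.g., with cooperative or heterogeneous binding).

```python
def scatchard_fit(bound, free):
    """Least-squares Scatchard fit; returns (Kd, Bmax).

    bound, free: equal-length lists of bound and free ligand
    concentrations at equilibrium, in the same units.
    """
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx                  # equals -1/Kd for one site class
    intercept = mean_y - slope * mean_x  # equals Bmax/Kd
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax


# Synthetic data generated from Kd = 2 and Bmax = 10 (arbitrary units).
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]
kd_est, bmax_est = scatchard_fit(bound_conc, free_conc)
```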

Specific binding of labeled DNA flap substrate or FEN-1
polypeptide to immobilized FEN-1 polypeptide or DNA flap
substrate, respectively, is determined by including unlabeled
competitor protein (e.g., albumin) and/or unlabeled DNA.
After a binding reaction is completed, the amount of labeled
species that is/are specifically bound to the immobilized
species is detected. For example and not for limitation, after
a suitable incubation period for binding, the aqueous phase
containing non-immobilized labeled FEN-1 protein is
removed and the substrate containing the immobilized DNA
flap substrate species and any labeled FEN-1 protein bound
to it is washed with a suitable buffer, optionally containing
unlabeled blocking agent(s), and the wash buffer(s)
removed. After washing, the amount of detectable label
remaining specifically bound to the immobilized DNA flap
substrate is determined (e.g., by optical, enzymatic, 
autoradiographic, or other radiochemical methods). 

In some embodiments, unlabeled blocking
agents that inhibit non-specific binding are included.
Examples of such blocking agents include, but are not
limited to, the following: calf thymus DNA, salmon sperm
DNA, yeast RNA, mixed sequence (random or pseudorandom
sequence) oligonucleotides of various lengths, bovine
serum albumin, nonionic detergents (NP-40, Tween, Triton
X-100, etc.), nonfat dry milk proteins, Denhardt's reagent,
polyvinylpyrrolidone, Ficoll, and other blocking agents.
Practitioners may, in their discretion, select blocking agents 
at suitable concentrations to be included in binding assays. 

In embodiments where a polypeptide or DNA flap substrate
is immobilized, covalent or noncovalent linkage to a
substrate may be used. Covalent linkage chemistries include,
but are not limited to, well-characterized methods known in
the art (Kadonaga and Tjian (1986) Proc. Natl. Acad. Sci.
(USA) 83: 5889). One example, not for limitation, is
covalent linkage to a substrate derivatized with cyanogen
bromide (such as CNBr-derivatized Sepharose 4B). It may
be desirable to use a spacer to reduce potential steric
hindrance from the substrate. Noncovalent bonding of proteins
to a substrate includes, but is not limited to, bonding
of the protein to a charged surface and binding with specific
antibodies.

In one embodiment, candidate therapeutic agents are
identified by their ability to block the binding of a FEN-1
polypeptide to a DNA flap substrate.

Typically, a FEN-1 polypeptide used in these methods 
comprises an amino acid sequence identical to a naturally- 
occurring FEN-1 protein sequence, although mutant FEN-1 
polypeptides are sometimes used if the mutant FEN-1 
polypeptide binds to the DNA flap substrate under control
assay conditions (e.g., physiological conditions). 

Methods for Forensic Identification 

The FEN-1 polynucleotide sequences of the present
invention can be used for forensic identification of individual
humans, such as for identification of decedents,
determination of paternity, criminal identification, and the
like. For example but not limitation, a DNA sample can be
obtained from a person or from a cellular sample (e.g., crime
scene evidence such as blood, saliva, semen, blood-stained
gloves, blood spots on the door of a white Ford Bronco, and
the like) and subjected to RFLP analysis, allele-specific
PCR, or PCR cloning and sequencing of the amplification
product to determine the structure of the FEN-1 gene region.
On the basis of the FEN-1 gene structure, the individual
from whom the sample originated will be identified with
respect to his/her FEN-1 genotype. The FEN-1 genotype
may be used alone or in conjunction with other genetic
markers to conclusively identify an individual or to rule out
the individual as a possible perpetrator.

In one embodiment, human genomic DNA samples from
a population of individuals (typically at least 50 persons
from various racial origins) are individually aliquoted into
reaction vessels (e.g., a well on a microtitre plate). Each
aliquot is digested (incubated) with one or more restriction
enzymes (e.g., EcoRI, HindIII, SmaI, BamHI, SalI, NotI,
AccI, ApaI, BglII, XbaI, PstI) under suitable reaction conditions
(e.g., see New England Biolabs 1995 catalog). Corresponding
digestion products from each individual are
loaded separately on an electrophoretic gel (typically
agarose), electrophoresed, blotted to a membrane by Southern
blotting, and hybridized with a labeled FEN-1 probe
(e.g., a full-length FEN-1 cDNA sequence of FIG. 1 (B),
FIG. 2 (B), or FIG. 5). Restriction fragments (bands) which
are polymorphic among members of the population are used
as a basis to discriminate FEN-1 genotypes and thereby
classify individuals on the basis of their FEN-1 genotype.
Similar categorization of FEN-1 genotypes may be performed
by sequencing PCR amplification products from a
population of individuals and using sequence polymorphisms
to identify alleles (genotypes), and thereby identify
or classify individuals.
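
For illustration and not limitation, the classification of individuals by their restriction fragment patterns might be sketched as follows. Fragment sizes are rounded to a fixed resolution so that small sizing errors between lanes do not split a genotype. The function names, the 100 bp resolution, and the fragment sizes are hypothetical and do not represent measured FEN-1 polymorphisms.

```python
from collections import defaultdict


def band_pattern(fragment_sizes_bp, resolution_bp=100):
    """Canonical band pattern: sizes rounded to the resolution, sorted."""
    rounded = {round(size / resolution_bp) * resolution_bp
               for size in fragment_sizes_bp}
    return tuple(sorted(rounded))


def classify_by_pattern(samples, resolution_bp=100):
    """Group sample names by their rounded band pattern."""
    groups = defaultdict(list)
    for name in sorted(samples):
        groups[band_pattern(samples[name], resolution_bp)].append(name)
    return dict(groups)


# Hypothetical hybridizing fragment sizes (bp) for one enzyme digest.
samples = {
    "person_1": [4210, 1490],
    "person_2": [4190, 1510],
    "person_3": [5700],
}
genotype_groups = classify_by_pattern(samples)
```

Fixed-width rounding can separate two measurements that fall on either side of a bin boundary, so a tolerance-based comparison may be preferable where sizing is imprecise.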

The invention also provides FEN-1 polynucleotide probes
for diagnosis of disease states (e.g., neoplasia or
preneoplasia) by detection of a FEN-1 mRNA or rearrangements
or amplification of the FEN-1 gene in cells explanted
from a patient, or detection of a pathognomonic FEN-1
allele (e.g., by RFLP or allele-specific PCR analysis).
Typically, the detection will be by in situ hybridization using
a labeled (e.g., ³²P, ³⁵S, ¹⁴C, ³H, fluorescent, biotinylated,
digoxigeninylated) FEN-1 polynucleotide, although Northern
blotting, dot blotting, or solution hybridization on bulk
RNA or poly A+ RNA isolated from a cell sample may be
used, as may PCR amplification using FEN-1-specific primers.
Cells which contain an altered amount of FEN-1 mRNA
as compared to non-neoplastic cells of the same cell type(s)
will be identified as candidate diseased cells. Similarly, the
detection of pathognomonic rearrangements or amplification
of the FEN-1 gene locus or closely linked loci in a cell
sample will identify the presence of a pathological condition
or a predisposition to developing a pathological condition
(e.g., cancer, genetic disease). The polynucleotide probes are
also used for forensic identification of individuals, such as
for paternity testing or identification of criminal suspects
(e.g., O. J. Simpson) or unknown decedents.

Methods of Rational Drug Design

FEN-1 polypeptides can be used for rational drug design
of candidate FEN-1-modulating agents (e.g., antineoplastics
and immunomodulators). The substantially purified FEN-1
and the identification of FEN-1 as a docking partner for 5'
DNA flap substrates as provided herein permits production
of substantially pure DNA flap/FEN-1 complexes (and DNA
nick/FEN-1 complexes) and substantially pure FEN-1
polypeptides. The disclosed sequences and protein sources
provide data for computational models which can be used
for protein X-ray crystallography or other structure analysis
methods, such as the DOCK program (Kuntz et al. (1982) J.
Mol. Biol. 161: 269; Kuntz ID (1992) Science 257: 1078)
and variants thereof. Potential therapeutic drugs may be
designed rationally on the basis of structural information
thus provided. In one embodiment, such drugs are designed
to prevent formation of a FEN-1 polypeptide: DNA flap
complex. Thus, the present invention may be used to design
drugs, including drugs with a capacity to inhibit binding of
FEN-1 to DNA flaps or nicks and to catalyze nuclease
activity on the flap strand. In one variation, such drugs are
structural mimics of a FEN-1 polypeptide sequence.

Use of FEN-1 in Diagnostic Assays

The invention also provides a novel diagnostic assay,
comprising contacting a sample believed to potentially contain
a predetermined target polynucleotide sequence (e.g., a [...]
capable of specific hybridization to all or a portion of said
target polynucleotide under assay conditions, and forming as
a result of the hybridization a 5' flap structure which can be
cleaved by FEN-1 releasing nucleotides (or polynucleotides)
in the flap strand; incubating with FEN-1 and detecting the
release of nucleotides (or polynucleotides) of the flap strand,
the release of nucleotides (or polynucleotides)
reporting the formation of a flap structure and thereby
reporting the presence, and optionally quantity, of the predetermined
target polynucleotide sequence in the sample as
proportional to the amount of cleaved, released nucleotides
or polynucleotides. Typically, the probe polynucleotide
comprises two portions, a first portion which hybridizes to
the target sequence and a second portion which is adjacent
to said first portion and which forms the flap strand; frequently
an adjacent polynucleotide is present which hybridizes
to the portion of the target polynucleotide immediately
5' to the portion of the target which hybridizes to the probe
polynucleotide sequence. The portion of the probe polynucleotide
which forms the flap is typically labeled, and the
entire probe may be labeled; in some embodiments, the
target polynucleotide is end-labeled. Often, the probe
polynucleotide is immobilized. The release of label in the
presence of FEN-1 measures the abundance of target polynucleotide
in the sample. For illustration and not limitation,
with reference to FIG. 6, a probe may correspond to the flap
strand, a target polynucleotide may correspond to the bridge
strand (F_br), and an adjacent strand may correspond to the
F_adj strand, as shown. Alternatively, a probe polynucleotide,
typically labeled, may be immobilized via its 5' end such
that hybridization to the target polynucleotide will form a
cleavable flap which can be cleaved by FEN-1 releasing the
cleaved portion of the probe polynucleotide which is not
immobilized [...]; quantitating the amount of released label
thereby [...]

Kits and FEN-1 Reagents 

The invention also provides kits comprising a vial
containing substantially purified FEN-1 having 5' flap cleavage
activity; such kits can be sold for practicing polynucleotide
diagnostic assays according to the methods described. Vials
of purified FEN-1 can be sold to the clinical lab or scientist
as commercial reagents, just like conventional diagnostic
products or laboratory biologicals (e.g., restriction enzymes,
Taq polymerase, monoclonal antibodies), categories which
have sufficient utility to have merited the granting of
numerous U.S. patents. Such reagents which can be used in
research and diagnostics have been patented; for example,
the restriction endonucleases AscI, FseI, PmeI, XcyI, SplI,
SrfI, and ApoI, among others, have been patented in the U.S.
(see U.S. Pat. Nos. 5,061,628, 5,196,330, 4,588,689,
4,886,756, 5,300,432, 5,200,336, and 5,200,337).

The following examples are given to illustrate the
invention, but are not to be limiting thereof. All percentages
given throughout the specification are based upon weight
unless otherwise indicated. All protein molecular weights
are based on mean average molecular weights unless
otherwise indicated.

EXPERIMENTAL EXAMPLES

Example 1: Identification of an Activity Which
Cuts DNA Flaps

Flap structures have been proposed as intermediates in a
variety of systems including DNA end-joining, homologous
recombination and DNA replication. In order to identify
mammalian flap cutting activities in vitro, we designed the
synthetic flap structure shown in FIG. 6. This structure, called
Flap Substrate 1, was created by annealing three
oligonucleotides together. We refer to these individual oligonucleotides
as the flap strand, the F_br (bridge) strand and the F_adj
(adjacent) strand. The F_br strand is complementary to both
the flap strand and the F_adj strand, and therefore bridges the
two strands allowing the structure to form. Flap Substrate 1
is a 5' flap structure because the flap strand terminates with
a 5' single-stranded end. Likewise, 3' flap structures have a
flap strand which terminates in a 3' end.

Treatment of Flap Substrate 1 with crude nuclear extract
from 1-8 pre-B cells or from NIH 3T3 fibroblasts resulted in
the formation of two major products. The 19 nt product was
the result of a cut 1 nt distal to the elbow, in the
single-stranded region of the flap strand. The 21 nt product was the
result of a cut 1 nt proximal to the elbow, in the
double-stranded region of the flap strand. These products could be
detected with as little as 10 ng of protein from the 1-8 crude
nuclear extract in the presence of a 1,000-fold molar excess
of salmon sperm DNA. Based on this specificity, we have
named this enzyme flap endonuclease-1 (FEN-1).

Analysis of nuclear extracts from other species indicated
that flap cleavage activity is evolutionarily conserved. We
detected flap cutting activity in extracts from calf thymus,
rabbit reticulocytes, Chinese hamster fibroblasts and
Drosophila embryos.



Purification of FEN-1

FEN-1 was purified from the nuclear fraction of a mouse
pre-B cell line. The chromatographic behavior of FEN-1 was
monitored using Flap Substrate 1 in the standard FEN-1
endonuclease assay.

FEN-1 activity eluted from the final column as a single 50
kDa band as assessed by silver-stained SDS-PAGE. The
amount of activity in each of the active fractions correlated
with the amount of 50 kDa protein present.



Biochemical properties of purified FEN-1

The effects of salt concentration, pH and divalent metal
ion concentration on this reaction are shown in Table I and
Table II.

TABLE I

Endonuclease activity characterization

FEN-1 activity versus pH

pH                               % Activity
5                                <1
6                                20
7                                66
8                                100
9                                62
10                               7

FEN-1 activity versus [monovalent metal ion]

[Monovalent metal ion] (mM)      % Activity
0 salt                           100
50 NaCl                          14
100 NaCl                         1
150 NaCl                         <0.5
50 KCl                           14
100 KCl                          2
150 KCl                          <1

FEN-1 activity versus [divalent metal ion]

[Divalent metal ion] (mM)        % Activity
1 MgCl₂                          72
10 MgCl₂                         100
1 MnCl₂                          1121
10 MnCl₂                         2725
1 CaCl₂                          17
1 ZnCl₂                          2
5 EDTA                           <1
1 MgCl₂ and 1 CaCl₂              56
1 MgCl₂ and 1 ZnCl₂              <1

FEN-1 was optimally active at pH 8 in the presence of 0
mM monovalent ions. NaCl and KCl were equally inhibitory,
with ~85% inhibition occurring at 50 mM monovalent salt.
FEN-1 activity was relatively insensitive to pH. Activity was
detected from pH 6 to 10. We also found that divalent metal
ions were absolutely required for FEN-1 activity.
Replacement of MgCl₂ with 5 mM EDTA completely inhibited the
reaction. Interestingly, substitution of MgCl₂ with 1 mM
MnCl₂ resulted in the formation of ~10-fold more product.
The concentration of MgCl₂ affected the cleavage site
preference of FEN-1. At 0.1 mM Mg²⁺, the enzyme cleaved
Flap Substrate 1 predominantly 1 nt proximal to the elbow
to give the 21 nt product. As the Mg²⁺ concentration was
increased, more cutting occurred 1 nt distal to the elbow to
give the 19 nt product. Maximum product was formed at 10
mM MgCl₂, which yielded an equimolar ratio of the two
products. Interestingly, sequences near the flap protrusion
also seemed to affect the preferred site of cleavage by
FEN-1, as discussed below.
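
The percentages quoted in the preceding paragraph follow directly from the relative activities in Table I. The short Python sketch below illustrates the arithmetic only; the function names are hypothetical, and the reference condition is assumed to be whichever entry the table reports as 100% (0 salt for the monovalent series, 10 mM MgCl₂ for the divalent series).

```python
# Relative endonuclease activities (% of the 100% reference), from Table I.
monovalent = {"0 salt": 100.0, "50 mM NaCl": 14.0, "50 mM KCl": 14.0}
divalent = {"10 mM MgCl2": 100.0, "1 mM MnCl2": 1121.0}


def percent_inhibition(activity, reference=100.0):
    """Percent loss of activity relative to the reference condition."""
    return 100.0 * (1.0 - activity / reference)


def fold_change(activity, reference=100.0):
    """Activity expressed as a multiple of the reference condition."""
    return activity / reference


for label in ("50 mM NaCl", "50 mM KCl"):
    # Both salts leave 14% activity, i.e. roughly 85% inhibition.
    print(label, round(percent_inhibition(monovalent[label]), 1))

# 1 mM MnCl2 relative to the 10 mM MgCl2 reference: roughly 10-fold.
print(round(fold_change(divalent["1 mM MnCl2"]), 1))
```

Measured against the 10 mM MgCl₂ reference, the 1 mM MnCl₂ entry corresponds to about an 11-fold increase, in line with the "~10-fold" stated in the text.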



TABLE II

Exonuclease activity characterization

FEN-1 activity versus pH

pH                               % Activity
5                                6
6                                60
7                                50
8                                100
9                                60
10                               [...]

FEN-1 activity versus [monovalent ion]

[Monovalent ion] (mM)            % Activity
0 salt                           100
50 NaCl                          2
100 NaCl                         <0.5
150 NaCl                         <0.5
50 KCl                           3
100 KCl                          <0.5
150 KCl                          <0.5

FEN-1 activity versus [divalent metal ion]

[Divalent ion] (mM)              % Activity
0.1 MgCl₂                        100
1 MgCl₂                          55
10 MgCl₂                         5
0.1 MnCl₂                        132
1 MnCl₂                          124
10 MnCl₂                         166
5 EDTA                           <0.5
1 MgCl₂ and 1 ZnCl₂              <1



Sequence independent cleavage and structure specificity

A second flap structure, called Flap Substrate 2, was used to
determine if FEN-1 has any sequence specificity. This
substrate differs in sequence from Flap Substrate 1, but has the
same 5' flap structure. FEN-1 efficiently cleaved Flap Substrate 2.
This indicates that FEN-1 recognizes the flap structure and
not a specific sequence in the substrate. Flap Substrate 2 had
only one major cut site, which was located 1 nt proximal to
the elbow, whereas Flap Substrate 1 had two major cut sites.

The specificity of FEN-1 for flap structures was further
tested by incubating purified FEN-1 with the labeled flap
strand alone. This strand was not cleaved when not annealed
to the other strands. This result demonstrates that purified
FEN-1 has no detectable single-strand endo- or exonuclease
activities. Furthermore, incubation of FEN-1 with
supercoiled pBluescript plasmid did not result in any detectable
nicking or double-strand cleavage, demonstrating that
FEN-1 does not have any double-strand endonuclease
activity.

To test whether FEN-1 cleaves any single-strand/double-strand
junction, FEN-1 was assayed for its ability to cleave
5' or 3' overhangs. Cleavage of this structure near the
single-strand/double-strand junction would result in the
generation of a product of ~15 nt on a denaturing acrylamide gel.
These overhang structures were not cleaved, showing that
FEN-1 is not simply recognizing the single-strand/double-strand
junctions, but instead is specific for the flap structure.

Strand specificity of FEN-1 

To determine whether FEN-1 cleaves flap structures in a
strand-specific manner, the F_br strand of Flap Substrate 1
was 5' end-labeled. Cutting of the F_br strand near the flap
protrusion would generate an ~14 nt labeled product on a
denaturing acrylamide gel. Cutting of this strand did not
occur, indicating that FEN-1 cleaves the flap structure in a
strand-specific fashion.

Effect of Flap Length on FEN-1 Cleavage

Models for DNA end-joining and homologous
recombination have been proposed which predict flap intermediates.
The flap strands in these models can vary from 1 nt to several
hundred in length. If FEN-1 is involved in DNA end-joining,
one would predict that it should cleave short single-stranded
flaps of 1 to 5 nt in length. To test this, we varied the length
of the flap strand of Flap Substrate 2 while holding the
lengths of the F_br and F_adj strands constant. We found that
FEN-1 was capable of cleaving both 1 and 5 nt flaps
efficiently. This result demonstrates that FEN-1 is capable of
resolving structures such as those proposed in DNA
end-joining.

Ability of FEN-1 to Cleave Other Structures

In addition to 5' flap structures, 3' flap structures have also
been proposed to serve as intermediates in
homology-dependent DNA end-joining. It was of interest, therefore, to
test whether FEN-1 also cleaves flaps with a 3'
single-stranded end. Using Flap Substrate 1 as a positive control, no
detectable cleavage of the 3' flap structure was observed,
even in the presence of 15 U of FEN-1.

To analyze further the substrate specificity of FEN-1,
several derivatives of the flap structure were tested. The
presence of the F_adj strand was found to be required for
efficient cleavage. In the case of Flap Substrate 2, cutting
efficiency was reduced 100-fold when the F_adj strand was
absent. Analysis of other flap sequences, such as Flap
Substrate 1, confirms the importance of the F_adj strand,
although the reduction in cutting efficiency was only 20-fold
for this substrate. In addition, the absence of the F_adj strand
did not simply change the cleavage site to the F_br strand.
These results indicate that FEN-1 cleavage is dependent on
the presence of the complete flap structure. It is conceivable,
however, that efficient cleavage may occur in the presence of
an F_adj strand which is recessed from the flap strand elbow by
one or more bases. We tested this and found that cleavage
efficiency dropped to 92%, 48% and 15% of that with Flap
Substrate 1 when the F_adj strand was recessed by 1, 3 and 5
nt, respectively. Thus, FEN-1 can cleave flap structures at a
reduced efficiency in the presence of a recessed F_adj strand.

T4 endo VII has also been shown to cleave Y-junctions.
A Y-junction can be made simply by annealing a fourth
oligonucleotide to the single-stranded tail of the flap strand
to make it double-stranded. We found that FEN-1 cleaved
the single-stranded flap 200-fold more efficiently than the
Y-junction. The residual cleavage may be the result of a
small amount of flap substrate which is not annealed with the
oligonucleotide SC6. Thus, the inability of FEN-1 to cleave
Y-junctions efficiently further confirms its specificity for 5'
single-stranded flaps.

If the RNA strand of the Okazaki fragment is displaced by
the incoming DNA polymerase, an RNA flap structure
would be formed. We created this structure by replacing the
labeled DNA flap strand (HJ42) of Flap Substrate 1 with a
labeled RNA oligonucleotide of the same sequence (HJ49).
The resulting RNA flap structure was not cleaved by FEN-1.

FEN-1 has 5'-3' Exonuclease Activity 

FEN-1 was incubated with a variety of end-labeled
substrates. We found that our FEN-1 preparation had
exonuclease activity with a specificity for 5' recessed ends. The
products of this reaction were low molecular weight
nucleotides ranging from one to several nucleotides in length, as
determined by high percentage sequencing polyacrylamide
gels. This exonuclease did not act on blunt ends, 3' recessed
ends or single-stranded DNA. In addition, this exonuclease
did not cleave RNA oligonucleotides alone or when
annealed to DNA. These results indicate that the
exonuclease present in the FEN-1 preparation is a 5'-3'
exonuclease which is specific for double-stranded DNA.

Based on the chromatographic elution profiles and the 
similar effects of salt, pH and divalent metal ions, it appears 
that FEN-1 is responsible for the exonuclease activity. 

Materials and Methods

DEAE Sepharose, heparin Sepharose, Phenyl Superose HR
5/5, Blue Sepharose, Mono S HR 5/5 and Sephadex G-25
were purchased from Pharmacia LKB, Inc. Hydroxylapatite,
Bio-Rex 70 and low molecular weight SDS-PAGE standards
were purchased from Bio-Rad. [γ-³²P]ATP and [α-³²P]dTTP
(3000 Ci/mmol) were purchased from Amersham. SDS,
bovine serum albumin Fraction V (BSA) and rRNA were
purchased from Sigma. Acrylamide was purchased from
Boehringer Mannheim. Penicillin-streptomycin was
purchased from Irvine Scientific. T4 polynucleotide kinase
(PNK) was obtained from New England Biolabs. Exo-minus
Klenow fragment of E. coli DNA Pol I was purchased from
United States Biochemical. Fetal bovine serum (FBS) and
RPMI 1640 were purchased from Gibco-BRL. X-OMAT AR
film was purchased from International Biotechnologies, Inc.

FEN-1 Purification

All purification steps were carried out at 0°-4° C. Six
liters of the murine pre-B cell line, 1-8 (Alt et al., 1984),
were grown to 2 × 10⁶ cells/ml in RPMI 1640, 10% FBS, 50
µM β-mercaptoethanol (βME), 100 µg/ml
penicillin-streptomycin in spinner flasks. Eighteen liters of RPMI 1640
and 100 µg/ml penicillin-streptomycin were then added, and
the cells were allowed to grow for an additional 24 h. The
cells were harvested at a cell density of 1-1.5 × 10⁶ cells/ml
and spun at 1000 g in a Beckman JA-10 rotor for 5 min. The
cell pellet was then washed with 500 ml of ice-cold PBS (5
mM NaPi, pH 7.4, 137 mM NaCl, 3 mM KCl) supplemented
with 1 g/l MgCl₂ and pelleted as above. Nuclei were isolated
by allowing the cell pellet to swell in 100 ml Buffer A (10
mM HEPES-KOH pH 7.8, 1.5 mM MgCl₂, 10 mM KCl, 0.5
mM DTT) for 10 min on ice. The swollen cells were spun
for 5 min at 2500 g in a Beckman J6B centrifuge,
resuspended in 40 ml Buffer A and homogenized with 20 strokes
of the B-pestle in a Dounce homogenizer. The nuclei were
collected by centrifugation at 2500 g in a Beckman J6B for
5 min and lysed by adding 40 ml of Buffer B (20 mM
HEPES-KOH pH 7.8, 0.42M NaCl, 25% glycerol, 1.5 mM
MgCl₂, 0.2 mM EDTA, 0.5 mM DTT, 0.5 mM PMSF).
Nuclear lysis was completed with 20 strokes of the B-pestle
in a Dounce homogenizer. The mixture was incubated on ice
for 30 min and then spun at 30 000 g in a Beckman JA-20
rotor to remove nuclear debris. This clarified nuclear extract
typically had a protein concentration of 2-4 mg/ml. To
remove nucleic acids, the nuclear fraction was applied
directly to a DEAE Sepharose column (1.6 cm × 7 cm, 10 ml)
equilibrated in Buffer C (50 mM Tris pH 7.4) containing 500
mM NaCl. The flow-through was diluted with 10 volumes of
Buffer D (50 mM Tris pH 8.5) and loaded onto a DEAE
Sepharose column (2.6 cm × 7.5 cm, 31 ml) which was
equilibrated with Buffer D containing 50 mM NaCl. The
flow-through from this second DEAE column was loaded
directly onto a heparin Sepharose (1.6 cm × 6 cm, 8 ml)
column equilibrated in Buffer D containing 50 mM NaCl.
Following loading, the heparin Sepharose column was
washed with 40 ml Buffer D containing 50 mM NaCl and
then with 40 ml of Buffer C containing 200 mM NaCl.
FEN-1 was eluted with 40 ml of Buffer C containing 400
mM NaCl. Solid (NH₄)₂SO₄ (10.4 g) was added to the
heparin Sepharose pool (40 ml) while stirring on ice. This
solution was spun at 28 000 g for 15 min in a Beckman
JA-20 rotor, and the supernatant was applied to an FPLC
Phenyl Superose HR 5/5 column which was equilibrated
with Buffer F [25 mM Tris pH 7.4, 2M (NH₄)₂SO₄].
Following loading, the column was washed with 10 ml of
Buffer F. FEN-1 activity was eluted with a 20 ml linear
gradient from Buffer F to Buffer G (50 mM Tris pH 8, 10%
glycerol). The active fractions (6 ml) were diluted
immediately with 60 ml of Buffer H (5 mM KPi, pH 7) and loaded
onto a hydroxylapatite column (1 cm × 4 cm, 3 ml)
equilibrated in Buffer H. The column was washed with 10 ml
Buffer H, and FEN-1 was eluted with a 30 ml linear gradient
from Buffer H to 500 mM KPi, pH 7. Active fractions were
pooled (9 ml), diluted with 40 ml of Buffer G, and loaded
onto a Blue Sepharose column (5 mm × 10 mm, 1 ml) which
was equilibrated in Buffer G containing 50 mM NaCl. The
column was washed successively with 20 ml of Buffer G
containing 50 mM NaCl and then with 20 ml Buffer G
containing 380 mM NaCl. FEN-1 was eluted with 1M NaCl
in Buffer G. Active fractions (6 ml) were diluted with 44 ml
of Buffer G and loaded onto a Mono S HR 5/5 column which
was equilibrated with Buffer G containing 50 mM NaCl. The
column was washed with 10 ml Buffer G containing 50 mM
NaCl and then with 10 ml Buffer I (50 mM Tris pH 9.5, 10%
glycerol) containing 50 mM NaCl. The activity was eluted
with a linear gradient from 50 mM NaCl to 600 mM NaCl
in Buffer I. Active fractions were diluted 5-fold with Buffer
J (25 mM KPi, pH 7) and loaded onto a Bio-Rex 70 column
(1 cm × 1 cm, 1 ml). The column was washed with 1 ml
Buffer J, and then eluted with a linear gradient from 0 to 700
mM NaCl in Buffer J. One milliliter fractions were collected
into siliconized polypropylene tubes. Active fractions were
frozen at -70° C. in the presence of 100 µg/ml BSA.
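
As a worked check on the ammonium sulfate step, the following Python sketch estimates the salt concentration reached when 10.4 g of solid (NH₄)₂SO₄ is dissolved in the 40 ml heparin Sepharose pool. It is an approximation only: the molar mass (132.14 g/mol) is a standard value not stated in the text, the helper name is hypothetical, and the volume increase on dissolving the solid is ignored, so the true concentration is somewhat lower.

```python
AMMONIUM_SULFATE_MW = 132.14  # g/mol for (NH4)2SO4 (standard value, assumed)


def molarity(mass_g, volume_ml, molar_mass):
    """Approximate molar concentration, ignoring volume change on dissolution."""
    moles = mass_g / molar_mass
    liters = volume_ml / 1000.0
    return moles / liters


# 10.4 g of solid salt added to the 40 ml heparin Sepharose pool.
conc = molarity(10.4, 40.0, AMMONIUM_SULFATE_MW)
print(f"{conc:.2f} M")
```

The estimate comes out just under 2 M, which is consistent with loading the sample onto a Phenyl Superose column equilibrated in Buffer F containing 2M (NH₄)₂SO₄.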
Preparation of nuclear extracts

Approximately 1 × 10⁹ cells were harvested from culture
or from the indicated tissue. After washing the cells in 50 ml
of PBS supplemented with 1 g/l MgCl₂, the cells and nuclei
were lysed as described in the FEN-1 purification procedure
except the volumes of Buffers A and B were 20-fold lower.
These extracts were either used immediately or frozen in
liquid nitrogen and stored at -80° C. until use.
Oligonucleotides

All DNA oligonucleotides were purchased from Operon
Technologies, Inc. (Alameda, Calif.). The SC series of
oligonucleotides was a kind gift from Pat Brown
(Department of Biochemistry, Stanford). The sequence of
each is shown after its name and is written from 5' to 3'.
HJ39, CACGTTGACTGAATC (SEQ ID NO: 53); HJ40,
ACCGTCTTGAGGCAGAGT (SEQ ID NO: 54); HJ41,
GGACTCTGCCTCAAGACGGTAGTCAACGTG (SEQ
ID NO: 55); HJ42,
CATGTCAAGCAGTCCTAACTTTGAGGCAGAGTCC (SEQ ID NO: 56); HJ43,
CACGTTGACTACCGTC (SEQ ID NO: 57); HJ47,
GTAGGAGATGTCCCTTGATGAATTC (SEQ ID NO: 58); SC1,
CAGCAACGCAAGCTTG (SEQ ID NO: 59); SC2,
TAGCAGGCTGCAGGTGGAC (SEQ ID NO: 60); SC3,
GTCGACCTGCAGCCCAAGCTTGCGTTGCTG (SEQ ID NO:
61); SC4, AGGCTGCAGGTCGAC (SEQ ID NO: 62); SC5,
ATGTGGAAAATCTCTAGCAGGCTGCAGGTCGAC
(SEQ ID NO: 63); SC6, TGCTAGAGATTTTCCACAT
(SEQ ID NO: 64). In addition, two RNA oligonucleotides
were purchased from National Biosciences (Plymouth,
Minn.). The sequence of each is shown after its name and is
written from 5' to 3'. HJ49,
GATGTCAAGCAGTCCTAACTTTGAGGCAGAGTCC; HJ50,
CCCAGATACGG. Prior to use, each oligonucleotide was gel
purified on a 15% denaturing polyacrylamide gel and
recovered by 'crush and soak' (Sambrook et al., 1989).
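
The geometry of Flap Substrate 1 can be checked from the oligonucleotide sequences alone. The Python sketch below assumes, as stated in the nuclease assay methods of Example 2, that the substrate is formed from HJ42 (flap strand), HJ41 (bridge strand) and HJ43 (adjacent strand). The HJ42 sequence used here is a reading of the printed sequence in which illegible residues were filled in from the RNA counterpart HJ49; that reconstruction, and the helper names, are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


HJ41 = "GGACTCTGCCTCAAGACGGTAGTCAACGTG"      # bridge strand
HJ43 = "CACGTTGACTACCGTC"                    # adjacent strand
HJ42 = "CATGTCAAGCAGTCCTAACTTTGAGGCAGAGTCC"  # flap strand (partly reconstructed)


def annealed_3prime_length(strand, bridge):
    """Longest 3' suffix of `strand` that base-pairs with the 5' end of `bridge`."""
    best = 0
    for n in range(1, min(len(strand), len(bridge)) + 1):
        if strand[-n:] == reverse_complement(bridge[:n]):
            best = n
    return best


paired = annealed_3prime_length(HJ42, HJ41)
flap_length = len(HJ42) - paired
adjacent_pairs_fully = HJ43 == reverse_complement(HJ41[-len(HJ43):])

print(paired, flap_length, adjacent_pairs_fully)
```

Under these assumptions the 3' 14 nt of HJ42 pair with the 5' end of HJ41, HJ43 pairs with the remaining 16 nt, and the unpaired 5' flap is 20 nt long. A cut 1 nt distal or 1 nt proximal to the elbow of a 5' end-labeled flap strand would then give labeled products of 19 nt and 21 nt, matching the product sizes reported in Example 1.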
Preparation of flap substrates and derivatives

Substrates were prepared by first 5' end-labeling 5 pmol of
the strand on which cutting was to be measured using
[γ-³²P]ATP and T4 polynucleotide kinase (PNK) according
to Sambrook et al. (1989). PNK and unincorporated
nucleotides were removed by phenol extraction followed by spin
column gel filtration through a 1 ml Sephadex G-25 column.
The labeled oligonucleotide was then
annealed to 10 pmol of the other oligonucleotides in 50 µl of
20 mM Tris pH 7.4, 150 mM NaCl by boiling the tube for
2 min in 300 ml H₂O. The oligonucleotides were allowed to
cool slowly to 4° C. in the same 300 ml H₂O.
Flap endonuclease assay

Flap cutting activity was measured in a 15 µl reaction
containing 50 mM Tris pH 8, 10 mM MgCl₂, 0.5 mM βME,
100 µg/ml BSA, and 10 fmol of 5' end-labeled substrate.
When crude fractions were assayed, 0.5 µg sonicated salmon
sperm DNA (1 kb average size) was added to compete away
non-specific nucleases. After 30 min at 30° C., the reaction
was terminated by adding 15 µl of 95% formamide, 10 mM
EDTA, 1 mg/ml bromophenol blue, 1 mg/ml xylene cyanole.
The tubes were heated to 95° C. for 5 min, loaded onto a
10% or 15% polyacrylamide gel (35 cm × 42.5 cm)
containing 7M urea and 1× TBE (89 mM Tris, 89 mM boric acid, 2
mM EDTA, pH 8), and run for 90 min at 75 W. Reaction
products were visualized by autoradiography or quantified
using a PhosphorImager (Molecular Dynamics). A unit of
activity is defined as the amount of enzyme required to
cleave 1 fmol of Flap Substrate 1 under standard FEN-1
endonuclease assay conditions.
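
The unit definition above can be turned into a simple calculation from quantified gel bands. The Python sketch below is illustrative only: it assumes that the fraction of substrate cleaved equals the product signal divided by the total lane signal (product plus uncleaved substrate), that the reaction is within its linear range, and that the reaction contained the standard 10 fmol of labeled substrate. The function names and the signal values are hypothetical.

```python
def fraction_cleaved(product_signal, substrate_signal):
    """Fraction of labeled substrate converted to product in one lane."""
    total = product_signal + substrate_signal
    if total <= 0:
        raise ValueError("lane contains no signal")
    return product_signal / total


def units_of_activity(product_signal, substrate_signal, substrate_fmol=10.0):
    """Units in the reaction, where 1 unit cleaves 1 fmol of substrate."""
    return fraction_cleaved(product_signal, substrate_signal) * substrate_fmol


# Hypothetical band intensities (arbitrary units) for a single lane.
print(units_of_activity(2500.0, 7500.0))
```

With these example values a quarter of the 10 fmol of substrate is cleaved, corresponding to 2.5 units of enzyme in the reaction.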
Oligonucleotide exonuclease assay

These structures were labeled and annealed using the
same procedure described for the flap substrates. One
exception to this is the 3' labeled recessed end substrate. This
substrate was made by first annealing oligonucleotides HJ41
and HJ43. The annealed substrate was then 3' end-labeled
with 1 U of Exo-minus Klenow in the presence of only
[α-³²P]dTTP. This allows no more than two nucleotide bases
to be incorporated. Following labeling and annealing, the
substrates were purified by running a 20% native
polyacrylamide gel to ensure that all unincorporated nucleotide was
removed. The substrates were eluted from the gel by crush
and soak at 30° C. (Sambrook et al., 1989). Following
precipitation, the DNA was resuspended in 20 mM Tris pH
7.4, 150 mM NaCl.

The standard oligonucleotide FEN-1 exonuclease assay
was carried out in a 15 µl reaction containing 10 fmol of
labeled substrate in 10 mM Tris pH 8, 0.1 mM MgCl₂, 0.5
mM βME for 30 min at 30° C. Reactions were stopped by
the addition of 15 µl 95% formamide, 5 mM EDTA, 1 mg/ml
xylene cyanole and 1 mg/ml bromophenol blue. The tubes
were then heated to 95° C. for 5 min and loaded onto a 15%
polyacrylamide gel (17 cm × 15 cm) containing 7M urea and
1× TBE. The reaction products were visualized by
autoradiography or quantified using a PhosphorImager (Molecular
Dynamics).

Poly[d(AT)] exonuclease assay

Poly[d(AT)] was labeled according to Goulian et al.
(1990) except that [α-³²P]dTTP was used in place of [³H]
dTTP. The specific activity of the final DNase I nicked
substrate was 3 × 10⁶ c.p.m./nmol. This substrate (30 ng) was
incubated with 10 U of FEN-1 in 10 mM Tris pH 8, 0.1 mM
MgCl₂, 0.5 mM βME and 100 µg/ml BSA in a 100 µl
reaction. Following a 30 min incubation at 30° C., the
reaction was stopped by adding 25 µl salmon sperm DNA (3
mg/ml) and 125 µl of 20% TCA. The tube was incubated on
ice for 5 min and then spun in a microfuge for 15 min at
15 000 r.p.m. The supernatant was removed, and the
acid-soluble counts were determined in a scintillation counter
(Beckman).
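
Acid-soluble counts from this assay can be converted to an amount of released material using the stated specific activity of 3 × 10⁶ c.p.m./nmol. The Python sketch below shows the conversion under simplifying assumptions that are not taken from the text: the whole supernatant is counted, counting efficiency is the same as when the specific activity was determined, and a no-enzyme background is subtracted. The function name and the example counts are hypothetical.

```python
SPECIFIC_ACTIVITY_CPM_PER_NMOL = 3e6  # stated for the nicked poly[d(AT)] substrate


def nmol_released(acid_soluble_cpm, background_cpm=0.0):
    """Convert background-corrected acid-soluble counts to nmol released."""
    net_cpm = max(acid_soluble_cpm - background_cpm, 0.0)
    return net_cpm / SPECIFIC_ACTIVITY_CPM_PER_NMOL


# Hypothetical reading: 15,500 c.p.m. with a 500 c.p.m. no-enzyme background.
released = nmol_released(15500.0, background_cpm=500.0)
print(f"{released * 1000:.1f} pmol")
```

In this example, 15,000 net c.p.m. corresponds to 0.005 nmol (5 pmol) of acid-soluble label.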

Example 2: Cloning the cDNA Encoding Mouse FEN-1

DNA flap substrate 1 was designed to detect
structure-specific endonucleases in mammalian cells (FIG. 6). This
substrate is a 5' flap structure because the flap strand
terminates with a 5' single-stranded end. Conversely, 3' flap
structures have a flap strand that terminates with a 3'
single-stranded end. Both 5' and 3' flap structures are
composed of a flap strand, an F_br (bridge) strand, and an F_adj
(adjacent) strand.

FEN-1 specifically cleaves 5' flap structures but
not 3' flap structures. Cleavage of flap substrate 1 occurs
primarily at 1 nucleotide proximal and 1 nucleotide distal to
the elbow of the flap strand. Other 5' flap structures,
however, are cleaved by FEN-1 primarily at 1 nucleotide
proximal to the flap strand elbow.

We microsequenced tryptic fragments of purified murine
FEN-1 and used this amino acid sequence to design
degenerate oligonucleotide probes. An initial screen of 300,000
plaques from a mouse thymocyte cDNA library yielded
eight positive clones that were related by restriction digest.
The largest clone was 2.1 kb and produced a protein of the
expected molecular weight when transcribed and
translated in vitro.
Materials and methods 

Cloning of FEN-1 and YKL510. Tryptic fragments from
35 pmoles of FEN-1 (as determined by amino acid analysis)
were HPLC purified and sequenced (Aebersold et al. 1987).
Degenerate oligonucleotides HJ53 and HJ54 were designed,
5'-end-labeled with [γ-³²P]ATP, and hybridized to λZAP
phage plaques containing a mouse thymocyte cDNA library
(Sambrook et al. 1989). Positives were plaque purified and
converted to plasmids according to the manufacturer's
instructions (Stratagene). PstI and PvuII restriction
fragments of the largest clone, designated pBS-FEN, were ligated
into pBluescript to form overlapping subclones. Each
subclone was then sequenced on both strands (Sanger et al.,
1977). Alignment of FEN-1 with related genes was carried
out by use of the Blast server (Altschul et al. 1990).

Construction of E. coli expression vectors. FEN-1 was
liberated from pBS-FEN by digestion with NcoI and
BamHI and cloned directionally into the NcoI and BamHI
sites of pET11d (Studier et al. 1990) to create pET-FEN.
YKL510 was cloned from S. cerevisiae genomic DNA by
PCR with primers HJ60 and HJ61. Primer HJ60 created an
NcoI site at the ATG translation start of YKL510, which
allowed the PCR product to be cloned into the NcoI site of
pET11d to create pET-YKL. RAD2 was subcloned from
pNF2000 (Naumovski and Friedberg 1984) into the SalI site
of pBluescript to create pBS-RAD2. RAD2 was amplified
by PCR from pBS-RAD2 using primers HJ62 and VBSK1.
These primers created an NcoI site at the ATG translation
start of RAD2, allowing the PCR product to be cloned
directionally into the NcoI and BamHI sites of pET11d to
create pET-RAD2. pET-ΔRAD2 was created by deleting the
region between the NcoI site and the SfuI site in pET-RAD2
and replacing it with the N region of RAD2, which was PCR
amplified from pBS-RAD2 with primers HJ62 and HJ63.

Overproduction of enzymes and extract preparation.
pET-FEN, pET-YKL, pET-RAD2, pET-ΔRAD2, and pET11d
were transformed separately into the E. coli strain BL21
(DE3) (Studier et al. 1990) by electroporation and plated
onto LB plates containing 100 µg/ml of ampicillin. Colonies
from each transformation were inoculated into 2 liters of LB
containing 100 µg/ml of ampicillin. The culture was shaken
at 37° C. until the OD₆₀₀ = 0.8 and then induced with 0.5 mM
IPTG for 2 hr. Following induction, cells were harvested by
centrifugation, resuspended in 40 ml of buffer A (50 mM
Tris at pH 8.5, 50 mM NaCl), and lysed by sonication.
Lysates were cleared by centrifugation at 25,000 g and
assayed for flap cleavage.

Enzyme purifications. FEN-1 was purified from 2 liters of
E. coli, BL21(DE3), containing the FEN-1 expression vector
pET-FEN. The crude extract from these cells was prepared
as described above and loaded onto a DEAE-Sepharose
column (2.6 × 8 cm, 30 ml). The flowthrough was collected
and loaded directly onto a heparin-Sepharose column (1.6 × 4
cm, 6 ml). The column was washed with buffer B (50 mM
Tris at pH 7.4) containing 200 mM NaCl, and FEN-1 was
eluted with buffer B containing 400 mM NaCl. The
heparin-Sepharose pool (25 ml) was diluted with 50 ml of buffer C
(20 mM KPi at pH 7, 5 mM EDTA) and loaded onto a
denatured DNA agarose column (1.6 × 3 cm, 5 ml).
Following loading, the column was washed with buffer C and
developed with a linear gradient from 0 to 600 mM KCl in
buffer C. Active fractions were pooled and diluted with 30
ml of buffer D (50 mM Tris at pH 9.5) and loaded onto a
Mono S HR 5/5 column. FEN-1 was eluted with a gradient
from 0-600 mM NaCl in buffer D. Active fractions were
pooled and loaded directly onto a hydroxylapatite column
(1.6 × 3 cm, 5 ml), which was subsequently developed with
a gradient from 5 to 350 mM KPi at pH 7. The
hydroxylapatite fractions containing FEN-1 activity were pooled and
labeled purified recombinant FEN-1 (fraction V).

YKL510 and ΔRAD2 were purified from E. coli by use of
the same procedure as FEN-1 and were found to behave
similarly to FEN-1 at each chromatographic step. Purified
YKL510 and ΔRAD2 from the hydroxylapatite column are
designated fraction V. As a control, extracts from E. coli
cells containing pET11d were prepared and purified exactly
as done for YKL510 and ΔRAD2. At each column, fractions
corresponding to active fractions in the YKL510 or ΔRAD2
purifications were pooled. The final purified material was
found to be devoid of detectable nuclease activity and is
designated purified PET (fraction V).

Nuclease Assays. The standard flap cleavage assay was
performed as described previously (Harrington and Lieber
1994). Briefly, oligonucleotide HJ42 was phosphorylated
with [γ-³²P]ATP and annealed to oligonucleotides HJ41 and
HJ43 to produce flap substrate 1. Reactions contained 10
fmoles of flap substrate 1 in 15 µl of 50 mM Tris (pH 8), 10
mM MgCl₂, 100 µg/ml of BSA, and the indicated amount of
extract or enzyme. Following incubation at 30° C. for 30
min, the reaction was terminated by the addition of 15 µl
of 95% formamide, 1 mg/ml of bromophenol blue, and 1
mg/ml of xylene cyanole and then heated to 95° C. for 5 min.
Products were separated on a 15% denaturing
polyacrylamide gel (7M urea) and visualized on a PhosphorImager.
One unit of nuclease activity is defined as the amount of
enzyme required to cleave 1 fmole of substrate under
standard assay conditions. For FEN-1, YKL510, or ΔRAD2,
1 unit of flap cleavage activity ≈ 0.1 ng of purified protein.

The exonuclease assay was performed as described
previously (Harrington and Lieber 1994). Briefly,
oligonucleotide SC6 was phosphorylated with [γ-³²P]ATP and annealed
to oligonucleotide SC5. The annealed oligonucleotide
substrate was purified on a 15% native acrylamide gel to remove
all unincorporated label. Reactions were performed as in the
standard flap assay; however, flap substrate 1 was replaced
with SC5/6, and the concentration of MgCl₂ was 0.1 mM.

The FEN-1 cDNA insert was subcloned into the E. coli
expression vector, pET11d (Studier et al. 1990) to create
pET-FEN. On induction, cells containing pET-FEN, but not
pET11d, produced a soluble 50-kD protein (data not
shown). In addition, extracts from cells containing pET-FEN
specifically cleaved a 5' flap structure, whereas extracts from
cells containing pET11d did not. Cleavage of the flap
substrate occurred near the elbow of the displaced flap
strand and resulted in labeled products that were 17-23
nucleotides in length. Using a modification of the purification
procedure described supra, recombinant FEN-1 was
purified to a single band on a Coomassie-stained SDS
polyacrylamide gel.

To confirm that the above cDNA clone encodes FEN-1,
the purified recombinant FEN-1 (fraction V) was compared
with FEN-1 that was purified previously from mouse lymphocytes.
The enzymatic activities and specificities of these
two enzyme preparations were identical. Recombinant
FEN-1 cleaved three different 5' flap structures and had 5'-3'
double-strand-specific exonuclease activity but did not
cleave 5' pseudo-Y structures or single-stranded DNA. A 5'
pseudo-Y structure is a 5' flap structure that is missing the
F_adj strand. In addition, both enzyme preparations failed to
cleave 3' flap structures.

By Western blot, antibodies generated against recombinant
FEN-1 (fraction V) recognize FEN-1 purified from
lymphocytes. These antibodies also inhibit murine lymphocyte
FEN-1 nuclease activity when present in the reaction,
whereas control antibodies do not. Taken together, these
results clearly indicate that the cDNA clone described below
encodes FEN-1.

The nucleotide sequence of FEN-1 revealed an open
reading frame of 1134 bp (FIG. 2). The amino acid sequence
of this putative polypeptide was found to be highly homologous
to the RAD2 protein family. The strongest homology
was found between FEN-1 and YKL510, an open reading
frame in Saccharomyces cerevisiae with previously
unknown function and unknown expression status. In addition
to being the same size, FEN-1 and YKL510 were 60%
identical (78% similar) at the amino acid level. FEN-1 was
also found to be equally homologous in sequence and
structure to the Schizosaccharomyces pombe Rad2 gene
(Lehmann et al. 1991; Carr et al. 1993).
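
The relationship between the stated open reading frame length and the length of the encoded polypeptide can be checked with simple arithmetic. The sketch below assumes that the quoted 1134 bp includes the stop codon; the text does not say so explicitly. The function name is hypothetical.

```python
def orf_residues(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an open reading frame of orf_bp base pairs.

    Assumes the ORF is in frame (length divisible by 3). If the quoted length
    includes the stop codon, one codon is subtracted.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons

if __name__ == "__main__":
    print(orf_residues(1134))         # stop codon counted in the 1134 bp
    print(orf_residues(1134, False))  # stop codon not counted
```

With the stop codon counted, 1134 bp gives 377 residues, which agrees with the length listed for SEQ ID NO:3 in the sequence listing.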

Significant homology was also found between FEN-1 and
regions within S. cerevisiae RAD2, termed N, I, and C.

One difference within the RAD2 family is that the N and
I regions are separated to varying extents. In FEN-1 and
YKL510, these regions are separated by ~15 amino acids,
whereas in RAD2, this separation is over 600 amino acids.
We have designated this intervening sequence the S region
(spacer).

On the basis of the high degree of conservation between
FEN-1 and YKL510, it is likely that YKL510 is the yeast
analog of FEN-1. We cloned the YKL510 gene into pET11d
and expressed it in E. coli. We found that extracts from cells
containing pET-YKL specifically cleaved a 5' flap structure,
whereas extracts from cells containing pET11d were inactive
on this substrate. Using the standard flap cleavage assay
to monitor chromatographic behavior, recombinant YKL510
was purified from E. coli and found to migrate as a 49-kD
band on SDS-PAGE. As a control, extract from cells containing
pET11d was purified in exactly the same way as
YKL510 and is designated pET (fraction V). No protein
could be detected in this fraction on a Coomassie-stained
SDS-polyacrylamide gel (data not shown). In addition, pET
(fraction V) was found to be devoid of nuclease activity.

Analysis of substrate specificity revealed that purified
YKL510 cleaved 5' flap structures but not a 3' flap structure
or single-stranded DNA. Like FEN-1, YKL510 efficiently
cleaved 5' flapped structures independent of flap strand
length and failed to cleave 5' pseudo-Y structures. In
addition, YKL510 was found to have 5'-3' double-strand-specific
exonuclease activity similar to that of FEN-1. The
products of YKL510 exonuclease activity were mono- and
dinucleotides as determined by high percentage sequencing
gels. The similar sequence, structure, and enzymatic activities
of these two enzymes indicate that YKL510 is the yeast
analog of FEN-1.

RAD2 is homologous to FEN-1 and YKL510 in three
major regions. The separation of the N and I regions by the
S region in RAD2, however, represents one difference
between these two RAD2 family subtypes.

To determine whether the S region of RAD2 can be
deleted to give a truncated RAD2 protein that is still a
functional nuclease, the RAD2 gene was modified to produce
ΔRAD2. Expression of ΔRAD2 from pET11d produced
a 5' flap cleavage activity that was absent from extracts
from cells containing pET11d. ΔRAD2 was purified by use
of the standard flap assay to monitor its chromatographic
behavior. As a control, a side-by-side purification of extracts
from cells containing pET11d was carried out. On a
silver-stained SDS-polyacrylamide gel, 30 ng of ΔRAD2
(fraction V) contained one 53-kD band, the expected size of
ΔRAD2 (data not shown).

To further ascertain purity, purified pET (fraction V) was
assayed for nuclease activity.

We analyzed the substrate specificities of purified ΔRAD2
and found that it was a structure-specific endonuclease that
cleaves 5' flap structures but not 3' flap structures. Like
FEN-1 and YKL510, ΔRAD2 also did not cleave 3'
pseudo-Y structures (data not shown). A 3' pseudo-Y structure
is a 3' flap structure that is missing the F_adj strand.
Interestingly, ΔRAD2 did cleave 5' pseudo-Y structures.

ΔRAD2 differed from FEN-1 and YKL510 in its ability to
act as a 5'-3' double-strand-specific exonuclease. FEN-1 and
YKL510 exonuclease activities were approximately half as
efficient as their respective DNA flap cleavage activities.
ΔRAD2, on the other hand, had only a weak 5'-3' exonuclease
activity that was 100 times less efficient relative to its
flap cleavage activity. Thus, the primary activity of ΔRAD2
appears to be endonucleolytic in nature.

Finally, ΔRAD2 did not cleave single-stranded DNA
oligonucleotides. Although ΔRAD2 is most active on flap
structures, cleavage of pseudo-Y structures is easily detected
in our assay.

The FEN-1 gene is the first characterized DNA structure- 
specific endonuclease to be cloned from any eukaryote. 

All three enzymes cleave 5' flap structures and fail to
cleave other DNA structures, including 3' flaps and single-stranded
DNA. Unlike FEN-1 and YKL510, ΔRAD2 efficiently
cleaves 5' pseudo-Y structures and has only a weak
exonuclease activity when compared with its flap endonuclease
activity.
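
The substrate specificities reported above for the three enzymes can be collected into a small lookup table. The following is an illustrative sketch only: the entries are taken from the results described in the text (ΔRAD2 denotes the truncated RAD2 construct), while the dictionary layout and function names are hypothetical.

```python
# True = cleavage reported in the text, False = no cleavage reported.
SPECIFICITY = {
    "FEN-1": {"5' flap": True, "3' flap": False, "5' pseudo-Y": False, "ssDNA": False},
    "YKL510": {"5' flap": True, "3' flap": False, "5' pseudo-Y": False, "ssDNA": False},
    "ΔRAD2": {"5' flap": True, "3' flap": False, "5' pseudo-Y": True, "ssDNA": False},
}

def cleaves(enzyme: str, substrate: str) -> bool:
    """Look up whether an enzyme was reported to cleave a substrate."""
    return SPECIFICITY[enzyme][substrate]

def distinguishing_substrates(a: str, b: str) -> list:
    """Substrates on which two enzymes were reported to behave differently."""
    return sorted(s for s in SPECIFICITY[a] if SPECIFICITY[a][s] != SPECIFICITY[b][s])

if __name__ == "__main__":
    print(distinguishing_substrates("FEN-1", "ΔRAD2"))
    print(distinguishing_substrates("FEN-1", "YKL510"))
```

In this table the 5' pseudo-Y structure is the only substrate that separates ΔRAD2 from FEN-1 and YKL510; the difference in relative exonuclease efficiency is quantitative and is not represented here.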

FEN-1 is likely involved in the removal of Okazaki
fragment primers during replication and in the removal of
damaged bases from DNA. E. coli DNA polymerase I has
been shown to be involved in DNA replication and DNA
repair (Baker and Kornberg 1992). The intrinsic 5'-3' exonuclease
domain of DNA polymerase I (Pol I) has been
shown to be absolutely required for both of these functions.
In addition, the 5'-3' exonuclease domain has been shown to
be a structure-specific endonuclease that cleaves 5' flap
structures (Lyamichev et al. 1993). Eukaryotic polymerases,
however, do not have an intrinsic 5'-3' exonuclease domain.
It has been proposed that this activity is localized to the
replication fork by protein-protein interactions with DNA
polymerase ε. DNA polymerase ε (Pol ε) has also been
shown to be involved in the repair of UV-damaged DNA.
Recently, it has been shown that the calf thymus 5'-3'
exonuclease also is a flap endonuclease that interacts functionally
with DNA Pol ε (Murante et al. 1994). On the basis
of the size and enzymatic properties of this enzyme, it is
likely to be the bovine analog of murine FEN-1. Thus, the
DNA Pol ε/FEN-1 complex can be a eukaryotic counterpart
to E. coli DNA Pol I.

We have shown that deletion of the spacer region of
RAD2 does not destroy its nuclease activity. This indicates
that the nuclease active site lies within the N, I, or C regions.

On the basis of the enzymatic activities of ΔRAD2
described here, we propose that RAD2 does not recognize
single-stranded DNA per se but, instead, recognizes precise
branched DNA structures. Furthermore, the orientation-specific
cleavage of these structures by ΔRAD2 has enabled
us to predict the placement of DNA scissions by RAD2
relative to the damaged base(s).

The inability of ΔRAD2 to cleave 3' pseudo-Y structures
or single-stranded DNA and its ability to cleave 5' pseudo-Y
structures, therefore, indicates that RAD2 cleaves the damaged
strand on the 3' side of the damage.

EXAMPLE 3 Isolation of Human FEN-1 cDNA and Genomic Clones

Materials and Methods

Isolation of human FEN1 cDNA and genomic clones. The
mouse Fen-1 cDNA has been isolated and sequenced
previously, supra. This cDNA was used to screen a human
genomic phage library (Stratagene 946205) and a human
leukemic T-cell cDNA phage library (Clontech
HLl01078a). The nitrocellulose filters (Schleicher and
Schuell) were hybridized in 3× SSC, 1 mM EDTA, 10 mM
Tris, pH 8.0, 1× Denhardt's, and 100 µg/ml salmon sperm
DNA at 60° C. overnight with 32P-random-primed probe.
These filters were washed in 2× SSC and 0.5% SDS at room
temperature for 10 min, followed by two 10-min washes at
60° C. in 0.5× SSC and 0.5% SDS.
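
The hybridization and wash steps above are specified as multiples of SSC. The sketch below converts those multiples into salt concentrations, assuming the common laboratory convention that a 20× SSC stock is 3.0 M NaCl and 0.3 M sodium citrate; that convention is not stated in the text. The step list restates the conditions given above, and the function names are hypothetical.

```python
# Assumed convention: 20x SSC = 3.0 M NaCl + 0.3 M sodium citrate,
# so 1x SSC = 150 mM NaCl + 15 mM sodium citrate.
NACL_MM_1X = 150.0
CITRATE_MM_1X = 15.0

def ssc_salts(strength: float) -> tuple:
    """Return (NaCl mM, sodium citrate mM) for a given SSC strength, e.g. 0.5 for 0.5x."""
    if strength <= 0:
        raise ValueError("SSC strength must be positive")
    return (NACL_MM_1X * strength, CITRATE_MM_1X * strength)

# (step name, SSC strength, temperature) as described in the text.
STEPS = [
    ("hybridization", 3.0, "60 C"),
    ("first wash", 2.0, "room temperature"),
    ("final washes", 0.5, "60 C"),
]

def describe(steps=STEPS) -> list:
    """List each step with its NaCl concentration in mM."""
    return [(name, ssc_salts(x)[0], temp) for name, x, temp in steps]

if __name__ == "__main__":
    for name, nacl_mm, temp in describe():
        print(f"{name}: {nacl_mm:.0f} mM NaCl at {temp}")
```

Under this convention the salt concentration falls from hybridization to the final washes, so the final 0.5× SSC washes at 60° C. are the most stringent step of the protocol.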

Somatic cell hybrids and radiation-reduced hybrid cell
lines (RRHs). For assignment of the Fen-1 gene to a mouse
chromosome, we used an established panel of 12 Chinese
hamster × mouse and 1 rat × mouse somatic cell hybrid lines
(Yang-Feng et al., 1986). The hybridization conditions were
identical to those that we have used previously (Yang-Feng
et al., 1986).

The three somatic cell hybrid cell lines (J1-44, J1-45,
J1-46), the 8 radiation-reduced hybrid cell lines (RRHs;
R184-1A2, R184-3A1, R184-7C1, R184-5D1, R184-4C2,
R185-1B1, and R131-33B1), and the Southern hybridization
protocols have been described previously (Gerhard et al.,
1992). The cell line DNA was digested with EcoRI, separated
by size, immobilized on nylon membranes, and hybridized
with a 600-bp fragment of the 5' end of the human
cDNA of FEN-1.

Chromosome in situ hybridization. Fluorescence in situ
hybridization of peripheral lymphocyte metaphase chromosomes
from normal individuals was carried out using biotin-11-dUTP
(Sigma) labeled human FEN-1 genomic clones
(HG1-1 and HG5-3). HG1-1 (19 kb), HG2-2 (12 kb), and
HG5-3 (17 kb) are three independent clones, each of which
contains sequences that hybridize with the murine FEN-1
cDNA by Southern blot. Hybridization conditions and
washes were as described (Spritz et al., 1991). After the final
wash, the slides were counterstained with 30 µl of anti-fade
medium (2.3% DABCO, 90% glycerol) containing 200
ng/ml of propidium iodide (PI) and 200 ng/ml DAPI. Slides
were analyzed using a Zeiss Axiophot microscope. Results
were directly photographed using Kodak Ektachrome ASA
400 color film.

Sequence of cDNA for Human FEN-1

Full-length cDNA for human FEN-1 was isolated from a
human leukemic T-cell cDNA phage library. The sequence
of human FEN1 is shown in FIG. 1. The identity between the
human FEN1 cDNA and the mouse cDNA is 95.3% at the
amino acid level and 89% at the nucleotide level. The amino
acid similarity is 97.6%. We have confirmed that this human
cDNA clone can be overexpressed in Escherichia coli to
yield a 50-kDa protein with the same endonucleolytic activity
that we have recently reported for the murine cDNA
(Harrington and Lieber, 1994a; and data not shown).
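
Percent identity figures such as those quoted above are computed from an alignment of the two sequences. The following is a minimal sketch of one common convention (matches divided by the number of aligned, ungapped columns); the text does not state which convention or alignment program was used, so this is an assumption. The function name is hypothetical. The short example uses the first eight residues of the human (SEQ ID NO:1) and mouse (SEQ ID NO:3) listings in one-letter code.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns containing a gap in either sequence are ignored (an assumed
    convention; other definitions divide by the full alignment length).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != gap and y != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)

if __name__ == "__main__":
    human = "MGIQGLAK"  # Met Gly Ile Gln Gly Leu Ala Lys
    mouse = "MGIHGLAK"  # Met Gly Ile His Gly Leu Ala Lys
    print(percent_identity(human, mouse))
```

Over this eight-residue window the two sequences differ at one position (Gln versus His), giving 87.5% identity; the full-length figures in the text require the complete alignment.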
Isolation of Genomic Clones

Several genomic clones were isolated from the phage
library screened using the cDNA of mouse FEN-1 as a
probe. Three clones, HG1-1, HG2-2, and HG5-3, were
further characterized. The estimated insert sizes of HG1-1,
HG2-2, and HG5-3 are 19, 12, and 17 kb, respectively.
These three clones contain sequences that hybridize with the
cDNA of mouse FEN-1 by Southern blot. With BamHI
digestion, using the cDNA of mouse FEN-1 as a probe, the
HG1-1, HG2-2, and HG5-3 clones each have one positive
fragment of 2.2, 3.0, and 8 kb, respectively. The 2.2- and
8-kb fragments correspond to the two FEN1 fragments
observed in the BamHI-digested total human genomic DNA.
With EcoRI or HindIII single digest using mouse FEN1
cDNA as a probe, the HG1-1 and HG2-2 clones both have
a common 3-kb fragment, and each has a smaller fragment
of dissimilar size by EcoRI digest. Other than this, these
three clones do not contain the same size fragments upon
EcoRI or HindIII digestion. However, the fragment sizes are
consistent with the fragments in total human DNA digested
with EcoRI or HindIII when probed with the cDNA of
mouse FEN-1. Hence, these three clones (HG1-1, -2-2, and
-5-3) do contain FEN1 DNA sequences. We were interested
in finding out (1) whether these three clones represent
different regions of the FEN1 genomic sequence or (2)
whether one clone represents the authentic FEN1 gene and
the other clones represent a pseudogene or a gene homologous
to FEN1. We chose to localize these three clones using
FISH.

Chromosomal Assignment of FEN1 in Human

When the biotinylated genomic clone HG1-1 is hybridized
to human metaphase chromosome spreads, fluorescent
signals are observed on sister chromatids of both homologs
of chromosome 1 in 40 cells examined, and DAPI banding
indicates that the specific signals are localized to band p22.2.
The clone HG2-2 is also localized to chromosome 1p22.2
(data not shown) in 25 cells examined using FISH. FISH
analysis and DAPI counterstaining of metaphase spreads
hybridized with the biotinylated HG5-3 clone reveal symmetrical
fluorescent signals on both homologs of chromosome
11q12-q13.1 in 45 cells examined.

The HG5-3 is the only clone localized to chromosome 11;
therefore, we further confirmed the FEN1 localization on
human chromosome 11 using hybrids containing all or
portions of the human chromosome 11 as the only human
material. Using the 5' end of FEN-1 human cDNA as a
probe, a 3.9-kb human-specific fragment is observed in
EcoRI-digested DNA. This band is absent from the cell line
J1-44 (deleted for human chromosome 11q12-q14), but is
present in J1-45 and J1-46 (deleted for a portion of human
11q13), indicating that FEN1 is in human chromosomal
region 11q12 or 11q13.5-q14. The FEN1-specific band is
present in 6 of 8 radiation-reduced hybrids (RRHs). The two
cell lines where it is lacking are R185-2C2 and R185-1B1.
This segregation pattern is identical to that of pepsinogen Al,
which has been localized to human chromosome 11q12
(Gerhard et al., 1992) (data not shown).

Chromosomal Assignment of FEN-1 in Mouse

A Southern blot of BglII-digested DNA from controls and
13 somatic cell hybrid lines containing subsets of mouse
chromosomes was hybridized with 32P-labeled mouse
cDNA. A single fragment of 3.8 kb is present in the mouse
control DNA, two fragments of 2.5 and 21.5 kb are observed
in the Chinese hamster control DNA, and a 8.0-kb band is
present in the rat control DNA. The 3.8-kb mouse band is
observed in all hybrids containing mouse chromosome 19.
All other mouse chromosomes are excluded by at least two
discordant hybrids.
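
The exclusion logic described above (a chromosome is excluded when at least two hybrids are discordant for it) can be sketched as follows. The panel in the example is a small hypothetical data set made up for illustration; it is not the 13-line panel used in the study, whose chromosome contents are not given in the text. Function names are likewise hypothetical.

```python
def discordant_counts(panel, chromosomes):
    """Count discordant hybrids per chromosome.

    panel: list of (set of chromosomes present in the hybrid, band present?).
    A hybrid is discordant for a chromosome when the chromosome is present
    but the band is absent, or the chromosome is absent but the band is present.
    """
    return {
        chrom: sum(1 for present, band in panel if (chrom in present) != band)
        for chrom in chromosomes
    }

def assign(panel, chromosomes, min_discordant=2):
    """Return (fully concordant chromosomes, chromosomes excluded by discordancy)."""
    counts = discordant_counts(panel, chromosomes)
    concordant = [c for c in chromosomes if counts[c] == 0]
    excluded = [c for c in chromosomes if counts[c] >= min_discordant]
    return concordant, excluded

if __name__ == "__main__":
    # Hypothetical toy panel: chromosome content and presence of the mouse band.
    toy_panel = [
        ({"19", "2"}, True),
        ({"19", "7"}, True),
        ({"2", "7"}, False),
        ({"7"}, False),
        ({"19"}, True),
    ]
    print(assign(toy_panel, ["2", "7", "19"]))
```

In the toy panel the band segregates perfectly with chromosome 19, while chromosomes 2 and 7 each accumulate more than two discordant hybrids and are excluded, mirroring the reasoning applied to the real panel.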
The foregoing description of the preferred embodiments
of the present invention has been presented for purposes of
illustration and description. They are not intended to be
exhaustive or to limit the invention to the precise form
disclosed, and many modifications and variations are possible
in light of the above teaching. Such modifications and
variations which may be apparent to a person skilled in the
art are intended to be within the scope of this invention.



( 1 ) GENERAL INFORMATION: 

( i i i ) NUMBER OF SEQUENCES: 63 

( 2 ) INFORMATION FOR SEQ ID NO:1:

( i ) SEQUENCE CHARACTERISTICS:

( A ) LENGTH: 380 amino acids
( B ) TYPE: amino acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: protein

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Met Gly Ile Gln Gly Leu Ala Lys Leu Ile Ala Asp Val Ala Pro Ser
1               5                   10                  15

Ala Ile Arg Glu Asn Asp Ile Lys Ser Tyr Phe Gly Arg Lys Val Ala
            20                  25                  30

Ile Asp Ala Ser Met Ser Ile Tyr Gln Phe Leu Ile Ala Val Arg Gln
        35                  40                  45

Gly Gly Asp Val Leu Gln Asn Gln Glu Gly Glu Thr Thr Ser His Leu
    50                  55                  60

Met Gly Met Phe Tyr Arg Thr Ile Arg Met Met Glu Asn Gly Ile Lys
65                  70                  75                  80

Pro Val Tyr Val Phe Asp Gly Lys Pro Pro Gln Leu Lys Ser Gly Glu
                85                  90                  95

Leu Ala Lys Arg Ser Glu Arg Arg Ala Glu Ala Glu Lys Gln Leu Gln
            100                 105                 110

Gln Ala Gln Ala Ala Gly Ala Glu Gly Glu Val Glu Lys Phe Thr Lys
        115                 120                 125

Arg Leu Val Lys Val Thr Lys Gln His Asn Asp Glu Cys Lys His Leu
    130                 135                 140

Leu Ser Leu Met Gly Ile Pro Tyr Leu Asp Ala Pro Ser Glu Ala Glu
145                 150                 155                 160

Ala Ser Cys Ala Ala Leu Val Lys Ala Gly Lys Val Tyr Ala Ala Ala
                165                 170                 175

Thr Glu Asp Met Asp Cys Leu Thr Phe Gly Ser Pro Val Leu Met Arg
            180                 185                 190

His Leu Thr Ala Ser Glu Ala Lys Lys Leu Pro Ile Gln Glu Phe His
        195                 200                 205

Leu Ser Arg Ile Leu Gln Glu Leu Gly Leu Asn Gln Glu Gln Phe Val
    210                 215                 220

Asp Leu Cys Ile Leu Leu Gly Ser Asp Tyr Cys Glu Ser Ile Arg Gly
225                 230                 235                 240

Ile Gly Pro Lys Arg Ala Val Asp Leu Ile Gln Lys His Lys Ser Ile
                245                 250                 255

Glu Glu Ile Val Arg Arg Leu Asp Pro Asn Lys Tyr Pro Val Pro Glu
            260                 265                 270

Asn Trp Leu His Lys Glu Ala His Gln Leu Phe Leu Glu Pro Glu Val
        275                 280                 285

Leu Asp Pro Glu Ser Val Glu Leu Lys Trp Ser Glu Pro Asn Glu Glu
    290                 295                 300

Glu Leu Ile Lys Phe Met Cys Gly Glu Lys Gln Phe Ser Glu Glu Arg
305                 310                 315                 320

Ile Arg Ser Gly Val Lys Arg Leu Ser Lys Ser Arg Gln Gly Ser Thr
                325                 330                 335

Gln Gly Arg Leu Asp Asp Phe Phe Lys Val Thr Gly Ser Leu Ser Ser
            340                 345                 350

Ala Lys Arg Lys Glu Pro Glu Pro Lys Gly Ser Thr Lys Lys Lys Ala
        355                 360                 365

Lys Thr Gly Ala Ala Gly Lys Phe Lys Arg Gly Lys
    370                 375                 380

( 2 ) INFORMATION FOR SEQ ID NO:2:

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 1144 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

ATGGGAATTC AAGGCCTGGC CAAACTAATT GCTGATGTGG CCCCCAGTGC CATCCGGGAG 60 

AATGACATCA AGAGCTACTT TGGCCGTAAG GTGGCCATTG ATGCCTCTAT GAGCATTTAT 120 

CAGTTCCTGA TTGCTGTTCG CCAGGGTGGG GATGTGCTGC AGAATGAGGA GGGTGAGACC 180 

ACCAGCCACC TGATGGGCAT GTTCTACCGC ACCATTCGCA TGATGGAGAA CGGCATCAAG 240 

CCCGTGTATG TCTTTGATGG CAAGCCGCCA CAGCTCAAGT CAGGCGAGCT GGCCAAACGC 300

AGTGAGCGGC GGGCTGAGGC AGAGAAGCAG CTGCAGCAGG CTCAGGCTGC TGGGGCCGAG 360 

CAGGAGGTGG AAAAATTCAC TAAGCGGCTG GTGAAGGTCA CTAAGCAGCA CAATGATGAC 420 

TGCAAACATC TGCTGAGCCT CATGGGCATC CCTTATCTTG ATGCACCCAG TGAGGCAGAG 480

GCCAGCTGTG CTGCCCTGGT GAAGGCTGGC AAAGTCTATG CTGCGGCTAC CGAGGACATG 540 

GACTGCCTCA CCTTCGGCAG CCCTGTGCTA ATGCGACACC TGACTGCCAG TGAAGCCAAA 600 

AAGCTGCCAA TCCAGGAATT CCACCTGAGC CGGATTCTGC AGGAGCTGGG CCTGAACCAG 660

GAACAGTTTG TGGATCTGTG CATCCTGCTA GGCAGTGACT ACTGTGAGAG TATCCGGGGT 720 

ATTGGGCCCA AGCGGGCTGT GGACCTCATC CAGAAGCACA AGAGCATCGA GGAGATCGTG 780 

CGGCGACTTG ACCCCAACAA GTACCCTGTG CCAGAAAATT GGCTCCACAA GGAGGCTCAC 840

CAGCTCTTCT TGGAACCTGA GGTGCTGGAC CCAGAGTCTG TGGAGCTGAA GTGGAGCGAG 900 

CCAAATGAAG AAGAGCTGAT CAAGTTCATG TGTGGTGAAA AGCAGTTCTC TGAGGAGCGA 960 

ATCCGCAGTG GGGTCAAGAG GCTGAGTAAG AGCCGCCAAG GCAGCACCCA GGGCCGCCTG 1020 

GATGATTTCT TCAAGGTGAC CGGCTCACTC TCTTCAGCTA AGCGCAAGGA GCCAGAACCC 1080 

AAGGGATCCA CTAAGAAGAA GGCAAAGACT GGGGCAGCAG GGAAGTTTAA AAGGGGAAAA 1140 

TAAA 1144
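
The correspondence between a cDNA listing and its protein listing can be spot-checked by translating the nucleotide sequence. The following is a minimal sketch using the standard genetic code; the function name is hypothetical, and the example input is the first 30 nucleotides of SEQ ID NO:2 as printed above. Residues are reported in one-letter code.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 1 until the first stop codon.

    Whitespace (as used to group listings into blocks of ten) is ignored,
    and trailing bases that do not form a full codon are dropped.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)

if __name__ == "__main__":
    print(translate("ATGGGAATTC AAGGCCTGGC CAAACTAATT"))
```

The output, MGIQGLAKLI, corresponds to Met Gly Ile Gln Gly Leu Ala Lys Leu Ile, the first ten residues of SEQ ID NO:1.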

( 2 ) INFORMATION FOR SEQ ID NO:3:

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 377 amino acids 
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: protein 
( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 3:



Met G I y lie His Cly Leu Ala Lys 



Ala lie A r g Glu Asa Asp lie Lys 
2 0 



Ser 

2 5 



lie Ala Asp 

1 0 

T y r Pbe G I y 



V a 1 Ala 



A r g 



L y s 
3 0 



P r o 
1 5 



V a 1 Ala 



lie Asp Ala Scr Met Ser 

3 5 



T y r 
4 0 



Gin Phc Leu lie 



A I a 
4 5 



V a 1 Arg G 1 d 



Gly Gly Asp Val Leu Gl 
5 0 



A s i 
5 5 



Glu Glu Gly Glu 



T h r 

6 0 



Thr Scr Leu Met 



Gly Met Phe Ty r Arg Thr lie Arg Met Glu 



A S D 

7 5 



Gly lie Lys Pro 



V a I 
8 0 



Tyr Val Phe Asp Gly Lys 
8 5 



Lys Arg Ser Glu Arg Arg 
1 0 0 



Pro Pro Gin 



Ala Glu 



L e u 

9 0 



A I a 
10 5 



Ly s 
L y s 



Ser Gly Glu 



Gin Leu 



G 1 n 
1 1 0 



Leu 

9 5 



Gin Ala 



Gin Glu Ala Gly Met Gtu 
1 1 5 



Va I 
1 2 0 



Glu Lys Phe Thr 



Lys 
1 2 5 



Arg Leu Va 



yi 



Lys Val Thr Lys Gin His 
13 0 



A s n 
I 3 5 



Asp Glu Cys Lys 



H i s 
14 0 



Leu Leu Ser Leu 



Met Gly lie Pro Tyr Leu Asp Ala Pro Ser 



G 1 u 

1 5 5 



Ala Glu Ala Ser 



C y s 
1 6 0 



Ala Ala Leu Ala Lys Ala Gly 

16 5 



Lys Val 



T y r 
17 0 



Ala Ala Ala Thr 



G I u 
1 7 5 



Asp 



Met Asp Cys Leu Thr Phe Gly 
1 8 0 



P r o 
1 8 5 



Val Leu Met Arg 



H i s 
1 9 0 



Leu Thr 



Ala Scr Glu Ala Lys Lys 
1 9 5 



Val Leu Gin Glu Leu Gly 
2 I 0 



Leu 



Leu 

2 15 



P r o 
2 0 0 



lie Gin Gtu Phc 



Asn Gin Glu Gl 



H i s 
2 0 5 



Phc 

2 2 0 



Leu Scr 



Val Asp Leu 



A r g 

Cy s 



lie Leu Leu Gly Ser Asp Tyr Cys Glu Ser 



1 1 e 

2 3 5 



Arg Gly lie Gly 



A I a 

2 4 0 



Lys Arg Ala Val Asp Leu 
2 4 5 



lie Gin Lys 



H i s 
2 5 0 



Lys Ser lie Glu 



G I u 
2 5 5 



Val Arg Arg Leu Asp Pro 
2 6 0 



His Lys Glu Ala Gin Gin 
2 7 5 



Scr Lys 



Phe 

2 8 0 



Ty r 

2 6 5 



Pro Val Pro Glu 



Leu Glu Pro Glu 



Asn 
2 7 0 



V a 1 
2 8 5 



T r p 
Asp 



Glu Ser Val Glu Leo Ly: 
2 9 0 



T r p 

2 9 5 



Ser Gtu Pro Asn 



G I u 
3 0 0 



Glu Glu Leu Val 



Lys Phe Met Cys Gly Glu Lys 



Gly Val Lys Arg Leu Ser Lys 
3 2 5 



Leu Asp Asp Phe Phe Lys 
3 4 0 



Lys Glu Pro Glu Pro Lys Gly 
3 5 5 



Gin Phe Ser 



Ser Arg 



Val Thr 



G 1 y 
3 4 5 



S e r 
3 6 0 



G I n 
3 3 0 



G 1 u 
3 1 5 



Gly Ser 



Arg 
T h r 



Ser Leu Scr Ser 



Ala Lys Lys Lys 



A I a 

3 6 5 



lie Arg 



G I y 
3 3 5 



A I a 

3 5 0 



Ly s 



Lys Thr 



S c r 

3 2 0 

Ar g 

Ar g 

G 1 y 



Gly Ala Gly Lys Phe Arg 
3 7 0 



A r g 

3 7 5 



Gly Lys 



( 2 ) INFORMATION FOR SEQ ID NO:4: 



( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 1930 base pairs 
( B ) TYPE: nucleic acid

( C ) STRANDEDNESS: single 

( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

ATGGGAATTC ACGGCCTTGC CAAACTAATT GCTGATGTGG CCCCCAGTCC CATCCGTGAG 60 

AATGACATCA AGAGCTACTT TGGTCGTAAA GTGGCCATCG ATGCCTCCAT GAGCATCTAC 120 

CAGTTCCTGA TTGCTGTTCG TCAGGGTGGG GATGTGCTGC AGAACGAGGA GGGTGAGACC 180 

ACCAGCCTGA TGGGCATGTT ATGGCAAACC ATCCGCATGG AGAATGGCAT CAAGCCTGTG 240

TACGTCTTTG ATGGCAAACC ACCACAGCTG AAGTCAGGCG AGCTGGCCAA GCGCAGTGAG 300 

AGGCGCGCCG AGGCTGAGAA GCAACTGCAG CAGGCTCAGG AGGCTGGGAT GGAGGAGGAG 360 

GTGGAGAAGT TCACCAAGAG GCTCGTGAAG GTCACCAAGC AACACAATGA TGAGTGCAAA 420 

CACCTCGTGA GCCTCATGGG CATCCCTTAC CTTGATGCAC CCAGCGAGGC AGAGGCCAGC 480 

TGTGCTGCCC TGGCAAAGGC TGGCAAAGTC TATGCTGCGG CCACGGAGGA CATGGACTGC 540 

CTCACTTTTG GCAGCCCCGT GCTAATGCGA CACTTAACTG CCAGTGAGGC CAAGAAGCTG 600

CCCATCCAAG AGTTCCATCT GACCCGCGTC CTGCAGGAGC TGGGTCTGAA CCAGGAGCAG 660 

TTTGTGGATC TGTGCATCCT GCTGGGTAGC GACTACTGCG AGAGCATCCG TGGCATTGGC 720

GCCAAGCGGG CTGTGGATCT CATCCAGAAA CATAAGAGCA TCGAGGAGAT CGTGAGGCGG 780

CTGGACCCCA GCAAGTACCC CGTTCCAGAG AACTGGCTCC ACAAGGAAGC CCAGCAGCTC 840

TTCCTGGAGC CAGAAGTAGT GGACCCAGAG TCTGTGGAGC TGAAGTGGAG CGAGCCAAAT 900 

GAAGAAGAGT TGGTCAAATT TATGTGTGGT GAAAAGCAGT TTTCTGAAGA GCGAATTCGC 960

AGTGGGGTCA AGCGGCTGAG TAAGAGCCGC CAGGGCAGCA CCCAGGGACG CCTCGATGAT 1020 

TTCTTCAAGG TGACAGGCTC ACTCTCCTCA GCTAAGCGCA AGGAGCCAGA ACCCAAGGGG 1080 

CCTGCTAAGA AGAAAGCAAA GACTGGGGGA GCGGGGAAGT TCCGAAGGGG AAAATAAACC 1140 

TGTCCTTCCC CTCCACTGTC CTTGACCCCA GGCTGTCTAT CTGTTTTGTA CCCTGCGCTG 1200 

CAGCACATCC CTCTTGTCCC TCGTCTTGAG GAGAGTTCAT TGCTTCCAGC GCTCGCCTTC 1260 

AGAGCTTTCC CTCTCTTGAC CCTGTGGCAG GAAGGCCGTA GCTCTGCTTT TTCTCATTTT 1320

TAGCTCAGGA AAGATGTCAG GCTCAAACCA CTTCTCAGGT TAATGGACAC TGTAGTCATT 1380 

GTTCTGTGCA ACTGCGAGCA ATGTCTTAAG GAAGAAGAAG ATAAAGCCGG GAGCGAGGCT 1440 

GGAGATAGTT TCCCAGCTGG CCAGCTGGTG GAGGAGAGGT GACTAGAACC TGACTGACTA 1500

CTGCTCCTTC TAATTTCACT GTCCCTGAAA GATGCCCATC AGCCTGGGAT TCGCTGATGG 1560 

AAGAACTGCA AAGAGACGCA GCAGAGAGAA GTCTGGCTGA CAACAGATTT AGTACTGACC 1620 

AGCTGATTTT TGTGGGCAGA AATTTGAACT TGCTGCCTGC TGAGTCCAGT AGTTGTGCAG 1680 

GGAGTGAGAT GGCAGTGTTT AAGTTTTGAT TTGTAGTTTT TTGTTTTTGT CTCTCCCCTC 1740 

TCCAGTGTTG GGGATTGACC CCAGGGCAAA GGCATTAAGT GTGCCACTGA CCTGTGCCTC 1800 

CAAGTGATGT TCTGACAGCC TTTCTGAGGC AATCAATTGA ATTGAGGTTT TGGGAGAAGA 1860

AACTGTTGTT CATAGGCTAT TTCTATTTTA AAAGATGTGA ACAGAAAAAA AAAACAATAA 1920 

AATTATAAAA 1930 

( 2 ) INFORMATION FOR SEQ ID NO:5: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 382 amino acids 
( B ) TYPE: amino acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear



( i i ) MOLECULE TYPE: protein 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:5: 



Mel Gly lie Ly s Gly Leu Asn Ala lie lie Ser Glu His Val Pro Ser 



Ala lie Arg Ly s Ser 
2 0 



Asp lie Lys Ser Phe Phe Gly Arg Ly j Val Al 



lie Asp Ala Ser Met Ser Leu Tyr Gin Phe Leu lie Ala Val Arg Gl: 



Asp 
5 0 



Leu 

6 S 



Lys Pro 



Ala Gli 



G 1 y 
G 1 y 
C y s 



Gly Gin Leu 



Met Ph. 



Glu Leu Thr 



A I a 
1 1 5 



Ty r 

7 0 



Ty . 



Lys 
1 0 0 



V a 1 

8 5 



T b r 

5 5 



Ar g 
Asp 



Arg Scr Scr 



Asn Glu Ala Gly 



Thr Leu Arg 



Thr Thr Glu Leu 



G I y 
A r g 



G 1 u 
12 0 



Lys 



Arg 
I 0 5 



P r o 
9 0 



G I u 
6 0 



M e I 

7 5 



Thr Thr Ser Mis 



lie Asp Asn Gly 



Pro Asp Leu Lys 



Val Glu Thr Gl 



Lys Met Lys Gin 



L y s 
1 1 0 



G 1 u 
12 5 



I 1 e 

8 0 



S e r 
9 5 



Ly s 
Arg 



L y s 
1 3 0 



Val Scr Lys G 1 u 



H i s 
1 3 5 



Asn Glu Cilu Ala 



G I n 
1 4 0 



Ly s 



Leu (il y 



Leu 
L 4 5 



Cys Ala 



G I y 
G I u 



Asp Met Asp 



T h r Phe 



Va I 
2 1 0 



Cys 
2 2 5 



S e r 
1 9 5 



lie Met Leu 



lie Val Glu 



lie Pro 



G I u 

2 9 0 



G I u 

2 7 5 



Lys 
3 0 5 



G I y 
L y s 



A I a 

16 5 



T h r 
1 8 0 



Glu Ala 



Leu Arg 



Pro Val Thr Ala 



G I y 
G I y 



Leu 
2 4 5 



Ty r 
1 5 0 

Lys 
Cy s 
L y s 
L e i 



lie lie Ala Pro 



Phe 

2 6 0 



L y s 
Ty r 

L y s 



Cys 
2 3 0 



Asp Trp Pro Tyr 



Val lie Asp Gly 



Asn 
2 9 5 



i Lys 
Glu Glu Arg 



Glu Leu 



Ly s 
3 2 5 



I I e 

3 1 0 



G 1 y 
A r g 



G I u 

2 0 0 



Asp 

2 1 5 



Asp Tyr 



Lys Leu lie 



Lys 



T h T 
1 8 5 



V a I 

1 7 0 



Thr 

1 5 5 



T y r 
Phe 



Glu Ala Glu Ala 



lie Glu Ser Gly 



Cys 
L y s 



G 1 u 
2 6 5 



Ly s 
2 8 0 



Ser Gly lie Ser 



I I e 



G I u 
3 5 5 



G 1 n 
3 4 0 



Gly Arg Leu Asp 



Gin Leu Ala Ala 



A I a 

3 6 0 



Thr 
2 5 0 



Gin Ala 



Glu lie Asn 



Glu Tyr Leu Cys 



Arg 

3 3 0 



Ala Ala Ala 



Leu Leu 



Leu Thr lie Glu 



G 1 y 
3 4 5 



Ser 

2 3 5 



A r g 

19 0 



P r u lie His Glu 



1 1 e 

2 0 5 



G I n 
2 2 0 



lie Arg Gly Val 



His Gly Ser lie 



L y s 

2 7 0 



Ar g 
Leu 



Asp 
3 I 5 



Ly s 
3 0 0 



Asp 
Lys 



L e u 

2 8 5 

T r p 
Lys 
L y s 



Phe Phe Gin Val 



Ala Lys Arg Ala 



G I n 

3 6 5 



G 1 u 
2 5 5 



Tr p 



Phe Leu 



Lys Phe 



Leu 

3 3 5 



V a 1 

3 5 0 



G I n 
1 6 0 



S e r 
1 7 5 



His Leu 



Asp Thr G I 



Phe Val Asp Leu 



G 1 y 
2 4 0 

Ly s 
Lys 
Asp 



Ser Pro Pro 



S e r 
3 2 0 

Lys 
Ly s 
Lys 



Lys Leu Asn Lys Asn Lys 
3 7 0 



Asn 

3 7 5 



Lys Val Thr Lys Gly 

3 8 0 



A r g 
( 2 ) INFORMATION FOR SEQ ID NO:6:

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 1149 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

ATGGGTATTA AAGGTTTGAA TGCAATTATA TCGGAACATG TTCCCTCTGC TATCAGGAAA 60 

AGCGATATCA AGAGCTTTTT TGGCAGAAAG GTTGCCATCG ATGCCTCTAT GTCTCTATAT 120

CAGTTTTTAA TTGCTGTAAG ACAGCAAGAC GGTGGGCAGT TCACCAATGA AGCCGGTGAA 180 

ACAACGTCAC ACTTGATGGG TATGTTTTAT AGGACACTGA GAATGATTGA TAACGGTATC 240 

AAGCCTTGTT ATGTCTTCGA CGGCAAACCT CCAGCTTTGA AATCTCATGA GTTGACAAAG 300 

CGGTCTTCAA GAAGGGTGGA AACAGAAAAA AAACTGGCAG AGGCAACAAC AGAATTGGAA 360 

AAGATGAAGC AAGAAAGAAG ATTGTTGAAG GTCTCAAAAG AGCATAATGA AGAAGCCCAA 420 

AAATTACTAG GACTAATGGG AATCCCATAT ATAATAGCGC CAACGGAAGC TGAGGCTCAA 480

TGTGCTGAGT TGGCAAAGAA GGGAAAGGTG TATGCCCCAG CAAGTGAAGA TATGGACACA 540 

CTCTGTTATA GAACACCCTT CTTGTTGAGA CATTTGACTT TTTCAGAGGC CAAGAAGGAA 600 

CCGATTCACG AAATAGATAC TGAATTAGTT TTGAGAGGAC TCGACTTGAC AATAGAGCAG 660

TTTGTTGATC TTTGCATAAT GCTTGGTTGT GACTACTGTG AAAGCATCAG AGGTGTTGGT 720 

CCAGTGACAG CCTTAAAATT GATAAAAACG CATGGATCCA TCGAAAAAAT CGTGGAGTTT 780 

ATTGAATCTG GGGAGTCAAA CAACACTAAA TGGAAAATCC CAGAAGACTG GCCTTACAAA 840

CAAGCAAGAA TGCTGTTTCT TGACCCTGAA GTTATAGATG GTAACGAAAT AAACTTGAAA 900 

TGGTCGCCAC CAAAGGAGAA GGAACTTATC GAGTATTTAT GTGATGATAA GAAATTCAGT 960 

GAAGAAAGAG TTAAATCTGG TATATCAAGA TTGAAAAAAG GCTTGAAATC TGGCATTCAG 1020

GGTAGGTTAG ATGGGTTCTT CCAAGTGGTC CCTAAGACAA AGGAACAGCT GGCTGCTGCG 1080 

GCGAAAAGAG CACAAGAAAA TAAAAAATTG AACAAAAATA AGAATAAAGT CACAAAGGGA 1140 

AGAAGATGA 1149 

( 2 ) INFORMATION FOR SEQ ID NO: 7: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 386 amino acids 
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

Met Gly His Val Ser Phe Trp Asp Ile Ala Gly Pro Thr Ala Arg Pro
1               5                   10                  15

Val Arg Leu Glu Ser Leu Glu Asp Lys Arg Met Ala Val Asp Ala Ser
            20                  25                  30

Ile Trp Ile Tyr Gln Phe Leu Lys Ala Val Arg Asp Gln Glu Gly Asn
        35                  40                  45

Ala Val Lys Asn Ser His Ile Thr Gly Phe Phe Arg Arg Ile Cys Lys
    50                  55                  60

Leu Leu Tyr Phe Gly Ile Arg Pro Val Phe Val Phe Asp Gly Gly Val
65                  70                  75                  80

Pro Val Leu Lys Arg Glu Thr Ile Arg Gln Arg Lys Glu Arg Arg Gln
                85                  90                  95

Gly Lys Arg Glu Ser Ala Lys Ser Thr Ala Arg Lys Leu Gln Gln Gln
            100                 105                 110

Met Lys Asp Lys Arg Asp Ser Asp Glu Val Thr Met Asp Met Ile Lys
        115                 120                 125

Glu Val Gln Glu Leu Leu Ser Arg Phe Gly Ile Pro Tyr Ile Thr Ala
    130                 135                 140

Pro Met Glu Ala Glu Gln Cys Ala Glu Leu Leu Gln Leu Asn Leu Val
145                 150                 155                 160

Asp Gly Ile Ile Thr Asp Asp Ser Asp Val Phe Leu Phe Gly Gly Thr
                165                 170                 175

Lys Ile Tyr Lys Asn Met Phe His Glu Lys Asn Tyr Val Glu Phe Tyr
            180                 185                 190

Asp Ala Glu Ser Ser Ile Leu Lys Leu Leu Gly Leu Asp Arg Lys Asn
        195                 200                 205

Met Ile Glu Leu Ala Gln Leu Leu Gly Ser Asp Tyr Thr Asn Gly Leu
    210                 215                 220

Lys Gly Met Gly Pro Val Ser Ser Ile Glu Val Ile Ala Glu Phe Gly
225                 230                 235                 240

Asn Leu Lys Asn Phe Lys Asp Trp Tyr Asn Asn Gly Gln Phe Asp Lys
                245                 250                 255

Arg Lys Gln Glu Thr Glu Asn Lys Phe Glu Lys Asp Leu Arg Lys Lys
            260                 265                 270

Leu Val Asn Asn Glu Ile Leu Leu Asp Asp Asp Phe Pro Ser Val Met
        275                 280                 285

Val Tyr Asp Ala Tyr Met Arg Pro Glu Val Asp His Asp Thr Thr Pro
    290                 295                 300

Phe Val Trp Gly Val Pro Asp Leu Asp Met Leu Arg Ser Phe Met Lys
305                 310                 315                 320

Thr Gln Leu Gly Trp Pro His Glu Lys Ser Asp Glu Ile Leu Ile Pro
                325                 330                 335

Leu Ile Arg Asp Val Asn Lys Arg Lys Lys Lys Gly Lys Gln Lys Arg
            340                 345                 350

Ile Asn Glu Phe Phe Pro Arg Glu Tyr Ile Ser Gly Asp Lys Lys Leu
        355                 360                 365

Asn Thr Ser Lys Arg Ile Ser Thr Ala Thr Gly Lys Leu Lys Lys Arg
    370                 375                 380

Lys Met
385

( 2 ) INFORMATION FOR SEQ ID NO:8: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 1161 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

ATGGGTGTGC ATTCATTTTG GGATATTGCA GGTCCTACGG CAAGACCGGT CAGGCTGGAA 60 

TCCTTGGAAG ATAAGAGAAT GGCAGTAGAT GCCTCCATTT GGATATATCA GTTTTTGAAA 120 

GCTGTCCGTG ATCAGCAGGG GAATGCAGTG AAGAATTCTC ATATTACTGG GTTCTTTAGA 180 

AGAATTTGTA AGCTATTATA CTTTGGCATT AGGCCGGTAT TCGTCTTTGA TGGTGGTGTG 240

CCCGTATTGA AAAGGGAAAC AATACGGCAG AGGAAAGAAA GAAGACAGGG AAAACGAGAG 300

AGTCCGAAAT CCACCGCTAG GAAGCTGCAA CAACAGATGA AGGATAAAAG AGATTCGGAT 360 

GAGGTAACTA TGGATATGAT CAAAGAAGTG CAAGAATTAC TATCGAGGTT TGGAATCCCC 420 

TATATCACTG CGCCTATGGA AGCTGAAGCA CAGTGTGCGG AATTGTTACA ACTAAACCTT 480 

GTCGATGGTA TAATTACCGA TGACAGTGAT GTTTTCCTAT TTGGAGGTAC AAAGATCTAC 540 

AAAAATATGT TCCACGAAAA GAACTATGTT GAATTTTATG ATGCGGAATC TATTTTAAAA 600 

TTATTGGGCT TGGATAGAAA GAATATGATT GAGTTGGCAC AGCTTTTAGG GAGCGATTAC 660

ACGAATGGAT TGAAGGGTAT GGGTCCCGTT TCAAGCATTG AAGTGATTGC AGAATTTGGA 720 

AACCTAAAAA ATTTTAAAGA CTGGTATAAT AATGGGCAGT TTGATAAACG TAAGCAAGAA 780 

ACGGAAAATA AATTTGAAAA AGACCTGAGA AAAAAACTGG TAAATAACGA AATTATCTTA 840 

GATGATGATT TTCCTAGCGT CATGGTTTAT GATGCGTATA TGAGACCAGA AGTCGATCAC 900 

GATACCACGC CGTTTGTTTC GGGGGTACCA GATCTCGATA TGCTTCGTTC ATTCATGAAG 960 

ACTCAACTAG GTTGGCCACA CGAAAAGTCT GATGAAATTC TCATTCCCTT AATTAGAGAT 1020 

GTTAATAAAC GCAAAAAGAA GGGGAAGCAA AAAAGGATTA ATGAATTTTT TCCAAGGGAG 1080

TACATATCTC GTGATAAGAA GCTCAATACA AGTAACAGAA TTTCAACCGC AACAGGTAAA 1140 

CTAAAGAAAA GAAAGATGTA A 1161 

( 2 ) INFORMATION FOR SEQ ID NO:9: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 2033 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (genomic)

( i x ) FEATURE:

( A ) NAME/KEY: CDS

( B ) LOCATION: 104..1237

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:9:

TCGCGGAAGC TGTGAAAGCG GCAGACGGAA CAGCACCGGG CTAGCCCGGC TTTGGCCATT 60

CTGCTCCGAA CATTCCTATT GTTGCCATTG CTCCTGTGCT ACC ATG GAA ATT CAC 115

Met Glu Ile His

1

GGC CTT GCC AAA CTA ATT GCT GAT GTG GCC CCC AGT GCC ATC CGT GAG

Gly Leu Ala Lys Leu Ile Ala Asp Val Ala Pro Ser Ala Ile Arg Glu

5 10 15 20

AAT GAC ATC AAG AGC TAC TTT GGT CGC AAA GTG GCC ATC GAT GCC TCC

Asn Asp Ile Lys Ser Tyr Phe Gly Arg Lys Val Ala Ile Asp Ala Ser

25 30 35

ATG AGC ATC TAC CAG TTC CTG ATT GCT GTT CGT CAG GGT GGG GAT GTG

Met Ser Ile Tyr Gln Phe Leu Ile Ala Val Arg Gln Gly Gly Asp Val

40 45 50

CTG CAG AAC GAG GAG GGT GAG ACC ACC AGC CTG ATG GGC ATG TTC TAC

Leu Gln Asn Glu Glu Gly Glu Thr Thr Ser Leu Met Gly Met Phe Tyr

55 60 65

CGT ACC ATG CGC ATG GAG AAT GGC ATC AAG CCT GTG TAC GTC TTT GAT

Arg Thr Met Arg Met Glu Asn Gly Ile Lys Pro Val Tyr Val Phe Asp

70 75 80

GGC AAA CCA CCA CAG CTG AAG TCA GGC GAG CTG GCC AAG CGC AGT GAG

Gly Lys Pro Pro Gln Leu Lys Ser Gly Glu Leu Ala Lys Arg Ser Glu

85 90 95 100

AGG CGC GCC GAG GCT GAG AAG CAA CTG CAG CAG GCT CAG CAG GCT GGG

Arg Arg Ala Glu Ala Glu Lys Gln Leu Gln Gln Ala Gln Gln Ala Gly

105 110 115

ATG GAG GAG GAG GTG GAG AAG TTC ACC AAG AGG CTC GTG AAG GTC ACC

Met Glu Glu Glu Val Glu Lys Phe Thr Lys Arg Leu Val Lys Val Thr

120 125 130

AAG CAA CAC AAT GAT GAG TGC AAA CAC CTG CTG AGC CTC ATG GGC ATC

Lys Gln His Asn Asp Glu Cys Lys His Leu Leu Ser Leu Met Gly Ile

135 140 145

CCT TAC CTT GAT GCA CCC AGC GAG GCA GAG GCC AGC TGT GCT GCC CTG

Pro Tyr Leu Asp Ala Pro Ser Glu Ala Glu Ala Ser Cys Ala Ala Leu

150 155 160

GCA AAG GCT GGC AAA GTC TAT GCT CCG GCC ACG GAG GAC ATG GAC TGC

Ala Lys Ala Gly Lys Val Tyr Ala Ala Ala Thr Glu Asp Met Asp Cys

165 170 175 180

CTC ACT TTT GGC AGC CCC GTG CTA ATG CGA CAC TTA ACT GCC AGT GAG

Leu Thr Phe Gly Ser Pro Val Leu Met Arg His Leu Thr Ala Ser Glu

185 190 195

GCC AAG AAG CTG CCC ATC CAA GAG TTC CAT CTG AGC CGC GTC CTG CAG

Ala Lys Lys Leu Pro Ile Gln Glu Phe His Leu Ser Arg Val Leu Gln

200 205 210

GAG CTG GGT CTG AAC CAG GAG CAG TTT GTG GAT CTG TGC ATC CTG CTG

Glu Leu Gly Leu Asn Gln Glu Gln Phe Val Asp Leu Cys Ile Leu Leu

215 220 225

GGT AGC GAC TAC TGC GAG AGC ATC CGT GGC ATT GGC GCC AAG CGG GCT

Gly Ser Asp Tyr Cys Glu Ser Ile Arg Gly Ile Gly Ala Lys Arg Ala

230 235 240

GTG GAT CTC ATC CAG AAA CAT AAG AGC ATC GAG GAG ATC GTG AGG CGG

Val Asp Leu Ile Gln Lys His Lys Ser Ile Glu Glu Ile Val Arg Arg

245 250 255 260

CTG GAC CCC AGC AAG TAC CCC GTT CCA GAG AAC TGG CTC CAC AAG GAA

Leu Asp Pro Ser Lys Tyr Pro Val Pro Glu Asn Trp Leu His Lys Glu

265 270 275

GCC CAG CAG CTC TTC CTG GAG CCA GAA GTA GTG GAC CCA GAG TCT GTG

Ala Gln Gln Leu Phe Leu Glu Pro Glu Val Val Asp Pro Glu Ser Val

280 285 290

GAG CTG AAG TGG AGC GAG CCA AAT GAA GAA GAG TTG GTC AAA TTT ATG

Glu Leu Lys Trp Ser Glu Pro Asn Glu Glu Glu Leu Val Lys Phe Met

295 300 305

TGT GGT GAA AAG CAG TTT TTT GAA GAG CGA ATT CGC AGT GGG GTC AAG

Cys Gly Glu Lys Gln Phe Phe Glu Glu Arg Ile Arg Ser Gly Val Lys

310 315 320

CGG CTG AGT AAG AGC CGC CAG GGC AGC ACC CAG GGA CGC CTC GAT GAT

Arg Leu Ser Lys Ser Arg Gln Gly Ser Thr Gln Gly Arg Leu Asp Asp

325 330 335 340

TTC TTC AAG GTG ACA GGC TCA CTC TCC TCA GCT AAG CGC AAG GAG CCA

Phe Phe Lys Val Thr Gly Ser Leu Ser Ser Ala Lys Arg Lys Glu Pro

345 350 355

GAA CCC AAG GGG CCT GCT AAG AAG AAA GCA AAG ACT GGG GGA GCG GGG

Glu Pro Lys Gly Pro Ala Lys Lys Lys Ala Lys Thr Gly Gly Ala Gly

360 365 370

AAG TTC CGA AGG GGA AAA TAAACCTGTC CTTCCCCTCC ACTGTCCTTG

Lys Phe Arg Arg Gly Lys

375

ACCCCAGGCT GTCTATCTGT TTTGTACCCT CGGCTGCAGC ACATCCCTCT TGTCCCTCGT 1327

CTTGAGGAGA GTTCATTGCT TCCAGCGCTG CCCTTCAGAG CTTTCCCTCT CTTGACCCTG 1387

TGGCAGGAAG GCCGTAGCTC TGCTTTTTCT CATTTTTAGC TCAGGAAAGA TGTCAGGCTC 1447



AAACCACTTC TCAGGTTAAT GGACACTGTA GTCATTGTTC TGTGCAACTG CGAGCAATGT 1507

CTTAAGGAAG AAGAAGATAA AGCCGGGAGC GAGGCTGGAG ATAGTTTCCC AGCTGGCCAG 1567

CTGGTliGAGG AGAGGTGACT AGAACCTGAC TGACTACTGC TCCTTCTAAT TTCACTGTCC 1627

CTGAAAGATG CCCATCAGCC TGGGATTCGC TGATGGAAGA ACTGCAAAGA CACGCAGCAG 1687

AGAGAAGTCT GGCTGACAAC AGATTTAGTA CTGACCAGCT GATTTTTGTG GGCAGAAATT 1747

TGAACTTGCT GCCTGCTGAG TCCAGTAGTT GTGCAGGGAG TGAGATGGCA GTGTTTAAGT 1807

TTTGATTTGT AGTTTTTTGT TTTTGTCTCT CCCCTCTCCA GTGTTGGGGA TTGACCCCAG 1867

GGCAAAGGCA TTAAGTGTGC CACTGACCTG TGCCTCCAAG TGATGTTCTG ACAGCCTTTC 1927

TGAGGCAATC AATTGAATTG AGGTTTTGGG AGAAGAAACT GTTGTTCATA GGCTATT1CI 1987

ATTTTAAAAG ATGTGAAGAG AAAAAAAAAA CAATAAAATT ATAAAA 2033 

( 2 ) INFORMATION FOR SEQ ID NO: 10: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 378 amino acids 
( B ) TYPE: amino acid 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: protein

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Glu Ile His Gly Leu Ala Lys Leu Ile Ala Asp Val Ala Pro Ser
1 5 10 15

Ala Ile Arg Glu Asn Asp Ile Lys Ser Tyr Phe Gly Arg Lys Val Ala
20 25 30

Ile Asp Ala Ser Met Ser Ile Tyr Gln Phe Leu Ile Ala Val Arg Gln
35 40 45

Gly Gly Asp Val Leu Gln Asn Glu Glu Gly Glu Thr Thr Ser Leu Met
50 55 60

Gly Met Phe Tyr Arg Thr Met Arg Met Glu Asn Gly Ile Lys Pro Val
65 70 75 80

Tyr Val Phe Asp Gly Lys Pro Pro Gln Leu Lys Ser Gly Glu Leu Ala
85 90 95

Lys Arg Ser Glu Arg Arg Ala Glu Ala Glu Lys Gln Leu Gln Gln Ala
100 105 110

Gln Gln Ala Gly Met Glu Glu Glu Val Glu Lys Phe Thr Lys Arg Leu
115 120 125

Val Lys Val Thr Lys Gln His Asn Asp Glu Cys Lys His Leu Leu Ser
130 135 140

Leu Met Gly Ile Pro Tyr Leu Asp Ala Pro Ser Glu Ala Glu Ala Ser
145 150 155 160

Cys Ala Ala Leu Ala Lys Ala Gly Lys Val Tyr Ala Ala Ala Thr Glu
165 170 175

Asp Met Asp Cys Leu Thr Phe Gly Ser Pro Val Leu Met Arg His Leu
180 185 190

Thr Ala Ser Glu Ala Lys Lys Leu Pro Ile Gln Glu Phe His Leu Ser
195 200 205

Arg Val Leu Gln Glu Leu Gly Leu Asn Gln Glu Gln Phe Val Asp Leu
210 215 220

Cys Ile Leu Leu Gly Ser Asp Tyr Cys Glu Ser Ile Arg Gly Ile Gly
225 230 235 240

Ala Lys Arg Ala Val Asp Leu Ile Gln Lys His Lys Ser Ile Glu Glu
245 250 255

Ile Val Arg Arg Leu Asp Pro Ser Lys Tyr Pro Val Pro Glu Asn Trp
260 265 270

Leu His Lys Glu Ala Gln Gln Leu Phe Leu Glu Pro Glu Val Val Asp
275 280 285

Pro Glu Ser Val Glu Leu Lys Trp Ser Glu Pro Asn Glu Glu Glu Leu
290 295 300

Val Lys Phe Met Cys Gly Glu Lys Gln Phe Phe Glu Glu Arg Ile Arg
305 310 315 320

Ser Gly Val Lys Arg Leu Ser Lys Ser Arg Gln Gly Ser Thr Gln Gly
325 330 335

Arg Leu Asp Asp Phe Phe Lys Val Thr Gly Ser Leu Ser Ser Ala Lys
340 345 350

Arg Lys Glu Pro Glu Pro Lys Gly Pro Ala Lys Lys Lys Ala Lys Thr
355 360 365

Gly Gly Ala Gly Lys Phe Arg Arg Gly Lys
370 375

( 2 ) INFORMATION FOR SEQ ID NO:11:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 30 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:11:

GGACTCTGCC TCAAGACGGT AGTCAACGTG 

( 2 ) INFORMATION FOR SEQ ID NO:12: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 13 amino acids 
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Gln Lys Arg Glu Ser Ala Lys Ser Thr Ala Arg Ala Arg
1 5 10 



( 2 ) INFORMATION FOR SEQ ID NO:13: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 26 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



( 2 ) INFORMATION FOR SEQ ID NO:14:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 28 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TTTATTTTCC CCTTTTAAAC TTCCCTGC

( 2 ) INFORMATION FOR SEQ ID NO:15: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 22 amino acids 
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:15: 

Ile Gln Gly Leu Ala Lys Leu Ile Ala Asp Val Ala Pro Ser Ala Ile
1 5 10 15

Arg Glu Asn Asp Ile Lys
20

( 2 ) INFORMATION FOR SEQ ID NO: 16: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 16 amino acids 
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Ser Met Ser Ile Tyr Gln Phe Leu Ile Ala Val Arg Gln Gly Gly Asp
1 5 10 15



( 2 ) INFORMATION FOR SEQ ID NO: 17: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 22 amino acids 
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Thr Ser His Leu Met Gly Met Phe Tyr Arg Thr Ile Arg Met Met Glu
1 5 10 15

Asn Gly Ile Lys Pro Val
20



( 2 ) INFORMATION FOR SEQ ID NO: 18: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 24 amino acids 
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:18: 

Gly Lys Pro Pro Gln Leu Lys Ser Gly Glu Leu Ala Lys Arg Ser Glu
1 5 10 15

Arg Arg Ala Glu Ala Glu Lys Gln
20



( 2 ) INFORMATION FOR SEQ ID NO: 19: 



( i ) SEQUENCE CHARACTERISTICS:

( A ) LENGTH: 20 amino acids
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:19: 

Glu Gln Glu Val Gln Lys Phe Thr Lys Arg Leu Val Lys Val Thr Lys
1 5 10 15

Gln His Asn Asp
20

( 2 ) INFORMATION FOR SEQ ID NO:20: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 25 amino acids 
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

Leu Leu Ser Leu Met Gly Ile Pro Tyr Leu Asp Ala Pro Ser Glu Ala
1 5 10 15

Glu Ala Ser Cys Ala Ala Leu Val Lys
20 25

( 2 ) INFORMATION FOR SEQ ID NO:21: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 23 amino acids
( B ) TYPE: amino acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: peptide

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:21:

Leu Thr Phe Gly Ser Pro Val Leu Met Arg His Leu Thr Ala Ser Glu
1 5 10 15

Ala Lys Lys Leu Pro Ile Gln
20



( 2 ) INFORMATION FOR SEQ ID NO:22: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 21 amino acids 
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

Ile Leu Gln Glu Leu Gly Leu Asn Gln Glu Gln Phe Val Asp Leu Cys
1 5 10 15

Ile Leu Leu Gly Ser
20



( 2 ) INFORMATION FOR SEQ ID NO:23: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 24 amino acids 
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 






( i i ) MOLECULE TYPE: peptide

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:23:

Arg Gly Ile Gly Pro Lys Arg Ala Val Asp Leu Ile Gln Lys His Lys
1 5 10 15

Ser Ile Glu Glu Ile Val Arg Arg
20



( 2 ) INFORMATION FOR SEQ ID NO:24: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 20 amino acids
( B ) TYPE: amino acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: peptide

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:24:

Pro Glu Asn Trp Leu His Lys Glu Ala His Gln Leu Phe Leu Glu Pro
1 5 10 15

Glu Val Leu Asp
20



( 2 ) INFORMATION FOR SEQ ID NO:25:

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 22 amino acids 
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

Trp Ser Glu Pro Asn Glu Glu Glu Leu Ile Lys Phe Met Cys Gly Glu
1 5 10 15

Lys Gln Phe Ser Glu Glu
20



( 2 ) INFORMATION FOR SEQ ID NO:26: 

( i ) SEQUENCE CHARACTERISTICS:

( A ) LENGTH: 22 amino acids
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:26:

Ser Lys Ser Arg Gln Gly Ser Thr Gln Gly Arg Leu Asp Asp Phe Phe
1 5 10 15

Lys Val Thr Gly Ser Leu
20



( 2 ) INFORMATION FOR SEQ ID NO:27: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 16 amino acids 
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:27:

Lys Glu Pro Glu Pro Lys Gly Ser Thr Lys Lys Lys Ala Lys Thr Gly
1 5 10 15

( 2 ) INFORMATION FOR SEQ ID NO:28: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 1144 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (polynucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

ATGCGAATTC AAGGCCTGGC CAAACTAATT GCTGATGTGG CCCCCAGTGC CATCCGGGAG 60 

AATGACATCA AGAGCTACTT TGGCCGTAAG GTGGCCATTG ATGCCTCTAT GAGCATTTAT 120 

CAGTTCCTGA TTGCTGTTCG CCAGGGTGGG GATGTGCTGC AGAATGAGGA GGGTGAGACC 180 

ACCAGCCACC TGATGGGCAT GTTCTACCGC ACCATTCGCA TGATGGAGAA CGGCATCAAG 240 

CCCGTGTATG TCTTTGATGG CAAGCCGCCA CAGCTCAAGT CAGGCGAGCT GGCCAAACGC 300 

AGTGAGCGGC GGGCTGAGGC AGAGAAGCAG CTGCAGCAGG CTCAGGCTGC TGGGGCCGAG 360 

CAGGAGGTGC AAAAATTCAC TAAGCGGCTG GTGAAGGTCA CTAAGCAGCA CAATGATGAG 420 

TGCAAACATC TGCTGAGCCT CATGGGCATC CCTTATCTTG ATGCACCCAG TGAGGCAGAG 480

GCCAGCTGTG CTGCCCTGGT GAAGGCTGGC AAAGTCTATG CTGCGGCTAC CGAGGACATG 540 

GACTGCCTCA CCTTCGGCAG CCCTGTGCTA ATGCGACACC TGACTGCCAG TGAAGCCAAA 600 

AAGCTGCCAA TCCAGGAATT CCACCTGAGC CGGATTCTGC AGGAGCTGGG CCTGAACCAG 660 

GAACAGTTTG TGGATCTGTG CATCCTGCTA GGCAGTGACT ACTGTGAGAG TATCCGGGGT 720 

ATTGGGCCCA AGCGGCCTGT GGACCTCATC CAGAAGCACA AGAGCATCGA GGAGATCGTG 780 

CGGCGACTTG ACCCCAACAA GTACCCTGTG CCAGAAAATT GGCTCCACAA GGAGGCTCAC 840

CAGCTCTTCT TGGAACCTGA GGTGCTGGAC CCAGAGTCTG TGGAGCTGAA GTGGAGCGAG 900 

CCAAATGAAG AAGAGCTGAT CAAGTTCATG TGTGGTGAAA AGCAGTTCTC TGAGGAGCGA 960 

ATCCGCAGTG GGGTCAAGAG GCTGAGTAAG AGCCGCCAAG GCAGCACCCA GGGCCGCCTG 1020 

GATGATTTCT TCAAGGTGAC CGGCTCACTC TCTTCAGCTA AGCGCAAGGA GCCAGAACCC 1080 

AAGGGATCCA CTAAGAAGAA GGCAAAGACT GGGGCAGCAG GGAAGTTTAA AAGGGGAAAA 1140 

TAAA 1144

( 2 ) INFORMATION FOR SEQ ID NO:29: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 45 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:29: 

TGGGAATTCA AGGCCTGGCC AAACTAATTG CTGATCTGGC CCCCA 45 

( 2 ) INFORMATION FOR SEQ ID NO:30: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 35 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:30:

TCACATCAAC AGCTACTTTG GCCGTAACGT CGCCA 35



( 2 ) INFORMATION FOR SEQ ID NO:31:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 37 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:31: 

TGCCTCTATG AGCATTTATC AGTTCCTGAT TGCTGTT 37 



( 2 ) INFORMATION FOR SEQ ID NO:32: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 33 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:32:

GGATGTGCTG CAGAATGACG AGGGTGAGAC CAC 33 



( 2 ) INFORMATION FOR SEQ ID NO:33: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 39 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

TGGGCATGTT CTACCGCACC ATTCGCATGA TGGAGAACG 39



( 2 ) INFORMATION FOR SEQ ID NO:34: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 41 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

CTTTGATGGC AAGCCGCCAC AGCTCAAGTC AGGCGAGCTG G 41 



( 2 ) INFORMATION FOR SEQ ID NO:35: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 32 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:35:

AGCAGCTGCA GCAGGCTCAG GCTGCTGGGG CC

( 2 ) INFORMATION FOR SEQ ID NO:36: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 35 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:36:

AATTCACTAA GCGGCTGGTG AAGGTCACTA AGCAG

( 2 ) INFORMATION FOR SEQ ID NO:37: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 32 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:37: 

ATGATGAGTG CAAACATCTG CTGAGCCTCA TG 

( 2 ) INFORMATION FOR SEQ ID NO:38: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 37 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:38:

ATCCCTTATC TTGATGCACC CAGTGAGGCA GAGGCCA 

( 2 ) INFORMATION FOR SEQ ID NO:39: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 44 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:39: 

GCCCTGGTGA AGGCTGGCAA AGTCTATGCT GCGGCTACCG AGGA 

( 2 ) INFORMATION FOR SEQ ID NO:40: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 33 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:40: 

CTTCGGCAGC CCTGTGCTAA TGCGACACCT GAC



( 2 ) INFORMATION FOR SEQ ID NO:41:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 36 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:41: 

CAGGAATTCC ACCTGAGCCG GATTCTGCAG GAGCTG

( 2 ) INFORMATION FOR SEQ ID NO:42: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 36 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:42: 

CCTGAACCAG GAACAGTTTG TGGATCTGTG CATCCT 

( 2 ) INFORMATION FOR SEQ ID NO:43: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 41 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:43:

AGGCAGTGAC TACTGTGAGA GTATCCGGGG TATTGGGCCC A 

( 2 ) INFORMATION FOR SEQ ID NO:44: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 39 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:44:

GGCTGTGGAC CTCATCCAGA AGCACAAGAG CATCGAGGA 

( 2 ) INFORMATION FOR SEQ ID NO:45: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 40 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:45:

CAAGTACCCT CTCCCACAAA ATTGGCTCCA CAAGGAGGCT 

( 2 ) INFORMATION FOR SEQ ID NO:46:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 38 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:46:

CTGAGGTGCT GGACCCAGAC TCTGTGGAGC TGAAGTGG

( 2 ) INFORMATION FOR SEQ ID NO:47: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 41 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:47: 

GATCAAGTTC ATGTGTGGTG AAAAGCAGTT CTCTGAGGAG C 

( 2 ) INFORMATION FOR SEQ ID NO:48: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 38 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:48: 

ATCCGCAGTG GGGTCAAGAG GCTGAGTAAG AGCCGCCA 

( 2 ) INFORMATION FOR SEQ ID NO:49: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 32 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:49: 

GCAGCACCCA GGGCCGCCTG GATGATTTCT TC



( 2 ) INFORMATION FOR SEQ ID NO:50: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 34 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:50:



( 2 ) INFORMATION FOR SEQ ID NO:51:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 41 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

CCCAAGGGAT CCACTAAGAA GAAGGCAAAG ACTGGGGCAG C

( 2 ) INFORMATION FOR SEQ ID NO: 52: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 15 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:52: 

CACGTTGACT GAATC 

( 2 ) INFORMATION FOR SEQ ID NO:53: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 18 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:53: 

ACCGTCTTGA GGCAGAGT 

( 2 ) INFORMATION FOR SEQ ID NO: 54: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 30 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

GGACTCTGCC TCAAGACGGT AGTCAACGTG 

( 2 ) INFORMATION FOR SEQ ID NO:55: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 34 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:55: 

CATGTCAAGC AGTCCTAACT TTGAGGCAGA GTCC 

( 2 ) INFORMATION FOR SEQ ID NO: 56: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 16 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:56: 

CACGTTGACT ACCGTC 



( 2 ) INFORMATION FOR SEQ ID NO:57:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 25 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:57:

GTAGGAGATG TCCCTTGATG AATTC 

( 2 ) INFORMATION FOR SEQ ID NO:58: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 16 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:58: 

CAGCAACGCA AGCTTG 

( 2 ) INFORMATION FOR SEQ ID NO:59: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 19 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:59:

TAGCAGGCTG CAGGTCGAC 

( 2 ) INFORMATION FOR SEQ ID NO:60: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 30 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:60: 

GTCGACCTGC AGCCCAAGCT TGCGTTGCTG 

( 2 ) INFORMATION FOR SEQ ID NO:61: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 15 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:61:

AGGCTGCAGG TCGAC 




( 2 ) INFORMATION FOR SEQ ID NO:62:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 33 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 



